What are indexes and why are they used?

Answer

Indexes are database objects that improve the speed of data retrieval operations on a table. They work like an index in a book, providing a fast path to locate specific data without scanning the entire table.

How Indexes Work

Think of an index as a separate structure that contains:

  • Key values from one or more columns
  • Pointers to the actual rows in the table
  • Sorted organization for fast searching
-- Without index: Full table scan
SELECT * FROM employees WHERE last_name = 'Smith';
-- Database scans all 100,000 rows

-- With index on last_name: Index seek
CREATE INDEX idx_last_name ON employees(last_name);
SELECT * FROM employees WHERE last_name = 'Smith';
-- Database uses index to find rows quickly

Types of Indexes

1. Clustered Index

  • Physical ordering: Rows are stored in the same order as the index
  • One per table: Only one clustered index allowed
  • Primary key: Usually created automatically on primary key
-- Clustered index (usually on primary key)
CREATE CLUSTERED INDEX idx_emp_id ON employees(emp_id);

2. Non-Clustered Index

  • Logical ordering: Separate structure pointing to data rows
  • Multiple allowed: Can have many non-clustered indexes
  • Additional storage: Requires extra disk space
-- Non-clustered indexes
CREATE INDEX idx_last_name ON employees(last_name);
CREATE INDEX idx_department ON employees(department);
CREATE INDEX idx_hire_date ON employees(hire_date);

3. Unique Index

  • Enforces uniqueness: Prevents duplicate values
  • Automatic creation: Created automatically with UNIQUE constraints
-- Unique index
CREATE UNIQUE INDEX idx_email ON employees(email);

4. Composite Index

  • Multiple columns: Index on combination of columns
  • Column order matters: Most selective column should be first
-- Composite index
CREATE INDEX idx_dept_salary ON employees(department, salary);

-- Effective for queries like:
SELECT * FROM employees WHERE department = 'IT' AND salary > 50000;

Benefits of Indexes

1. Faster SELECT Operations

-- Without index: O(n) - linear scan
SELECT * FROM employees WHERE emp_id = 12345;

-- With index: O(log n) - binary search
-- Dramatically faster on large tables

2. Faster JOIN Operations

-- Indexes on join columns improve performance
SELECT e.name, d.dept_name
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id; -- Index on dept_id helps

3. Faster ORDER BY

-- Index on ordered column avoids sorting
SELECT * FROM employees ORDER BY last_name; -- Index on last_name helps

4. Faster GROUP BY

-- Index on grouped column improves performance
SELECT department, COUNT(*) 
FROM employees 
GROUP BY department; -- Index on department helps

Drawbacks of Indexes

1. Storage Overhead

-- Each index requires additional disk space
-- Rule of thumb: 10-20% of table size per index

2. Slower INSERT/UPDATE/DELETE

-- Every modification must update indexes too
INSERT INTO employees (name, department) VALUES ('John Doe', 'IT');
-- Must update table + all indexes on the table

3. Maintenance Overhead

-- Indexes need maintenance as data changes
-- Fragmentation can occur over time

When to Create Indexes

Good Candidates:

  • Frequently queried columns in WHERE clauses
  • JOIN columns used in table relationships
  • ORDER BY columns for sorting
  • Foreign key columns for referential integrity
-- Good index candidates
CREATE INDEX idx_customer_id ON orders(customer_id); -- FK column
CREATE INDEX idx_order_date ON orders(order_date);   -- Frequently filtered
CREATE INDEX idx_status ON orders(status);           -- WHERE clause column

Poor Candidates:

  • Small tables (< 1000 rows)
  • Frequently updated columns
  • Columns with low selectivity (few unique values)
-- Poor index candidates
-- CREATE INDEX idx_gender ON employees(gender);     -- Only M/F values
-- CREATE INDEX idx_active ON users(is_active);      -- Only true/false

Index Management

Creating Indexes

-- Basic index
CREATE INDEX idx_name ON table_name(column_name);

-- Composite index
CREATE INDEX idx_multi ON table_name(col1, col2, col3);

-- Unique index
CREATE UNIQUE INDEX idx_unique ON table_name(column_name);

-- Partial index (PostgreSQL)
CREATE INDEX idx_active_users ON users(email) WHERE is_active = true;

Viewing Indexes

-- MySQL
SHOW INDEXES FROM table_name;

-- PostgreSQL
\d table_name

-- SQL Server
SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID('table_name');

Dropping Indexes

DROP INDEX idx_name ON table_name; -- MySQL
DROP INDEX idx_name;                -- PostgreSQL/SQL Server

Performance Examples

Before Index (Slow)

-- Table with 1 million employees
SELECT * FROM employees WHERE last_name = 'Smith';
-- Execution time: 2.5 seconds (full table scan)
-- Rows examined: 1,000,000

After Index (Fast)

CREATE INDEX idx_last_name ON employees(last_name);
SELECT * FROM employees WHERE last_name = 'Smith';
-- Execution time: 0.01 seconds (index seek)
-- Rows examined: 127 (only matching rows)

Index Design Best Practices

1. Column Order in Composite Indexes

-- Good: Most selective column first
CREATE INDEX idx_search ON products(category, brand, price);
-- Effective for: WHERE category = 'Electronics' AND brand = 'Apple'

-- Poor: Less selective column first
CREATE INDEX idx_poor ON products(price, category, brand);
-- Less effective for category-based searches

2. Include Columns (SQL Server)

-- Covering index with included columns
CREATE INDEX idx_covering ON employees(department) 
INCLUDE (name, salary, hire_date);
-- Query can be satisfied entirely from index

3. Filtered Indexes

-- Index only active records
CREATE INDEX idx_active_employees ON employees(last_name) 
WHERE status = 'ACTIVE';

Monitoring Index Usage

-- Check index usage statistics (SQL Server)
SELECT 
    i.name AS index_name,
    s.user_seeks,
    s.user_scans,
    s.user_lookups
FROM sys.indexes i
JOIN sys.dm_db_index_usage_stats s ON i.object_id = s.object_id;

Interview Tips

  • Understand that indexes trade storage space for query speed
  • Know the difference between clustered and non-clustered indexes
  • Be able to identify good and poor index candidates
  • Understand the impact on INSERT/UPDATE/DELETE operations
  • Know how to create composite indexes with proper column ordering

Test Your Knowledge

Take a quick quiz to test your understanding of this topic.

Test Your SQL Knowledge

Ready to put your skills to the test? Take our interactive SQL quiz and get instant feedback on your answers.