Database Programming :: Lessons :: Query Optimization
An index is a sorted table that can be created for a field to make SELECT queries quicker. It leads to slower INSERT and UPDATE queries, however, since the index needs to be created every time there is an update to the source data. The index also adds to the size of the database. Most of the time these trade offs are worth it. Below are the guidelines for deciding whether or not a field should have an index:
- The primary key is automatically indexed
- Columns used in WHERE clauses should be indexed
- Columns used in aggregate functions such as min and max should be indexed
- Columns used in GROUP BY or ORDER BY should be indexed
- Columns other than the primary key that are unique should be indexed
Indexes can also be added to multiple fields, which is helpful if fields are used in conjunction a lot, such as last name and first name fields.
The EXPLAIN command can tell you how well queries are using indexes and can help you improve your queries.
EXPLAIN SELECT * FROM `Director` WHERE `DirectorLast` = 'Spielberg'
The above query does not actually run the query, but instead produces a table like the following:
The type of the query is the join type, which can be one of the following, ranked from fastest to slowest:
- system: The table is a system table with only one row.
- const: The table has at most one matching row, can be read once, and is treated as a constant for the remainder of the query.
- eq_ref: No more than one row from this table will be read for each combination of rows from the previous table.
- ref: All matching rows from this table will be read for each combination of rows from the previous table.
- range: Only rows in a given range in the table will be retrieved using an index.
- index: A full scan of the index will be performed for each combination of rows from previous tables.
- ALL: A full scan of the table will be performed for each combination of rows from previous tables.
The possible_keys result lists any fields that could serve as keys to improve performance.
Adding an index to the DirectorLast field in the above example will result in the following EXPLAIN table:
|1||SIMPLE||Director||ref||DirectorLast||DirectorLast||37||const||1||Using index condition|
The ref result specifies which columns or constants are used to select rows from a table and the rows result specifies how many rows might need to be examined to execute the query.
The LIMIT clause can be added to a query to limit the number of results it will return. If you only need the first 5 directors from the director table, it doesn't make sense to output all the table contents. You could do the following:
SELECT * FROM `Director` ORDER BY `DirectorLast` LIMIT 5