Quick results, not optimum effect, are the point of this article: here are quick and dirty database performance tuning methods. Oracle tuning, Informix tuning, DB2 tuning, PostgreSQL tuning, MySQL tuning? It does not matter which RDBMS you use these days; database performance tuning is pretty much the same across them, or at least this method of finding and fixing database performance problems is the same. I would use the same approach for Sybase tuning and SQL Server tuning, though I have never administered either product. RDBMS tuning is much the same across products because database software is much the same these days: any feature that is useful in one RDBMS is quickly copied by the other database software. All RDBMS (Relational Database Management Systems) have an engine, an optimizer, SQL, an operating system they run on, and an applications interface to the RDBMS; most have stored procedures. They are more similar than different.
Database configuration, disk layout and the operating system are usually under the configuration management control of the database administrator and the system administrator, and this article addresses what you can control. Many applications are controlled by third-party vendors and subject to processes outside the administrator's control, so applications are not addressed in detail here.
Here are the heuristics listed in order of effort expended:
Switch your point of view to examine the whole system, not just the database. This may be the easiest and most effective tuning technique.
You may not have to tune the database at all. At one customer site, shipping labels were not generated fast enough; the multiple-CPU Unix system was maxed out for hours, using all the I/O and CPU resources. "The database is too slow," the customer told me. "Go tune it." I looked at the database statistics and also the operating system statistics. That's when I noticed many processes spawning per minute from the Unix "at" command (a basic batch scheduling utility).
It was clear that the database was not the cause of this problem. The application scheduled a batch job for each shipping label with the "at" command, which had the Unix OS thrashing as it tried to generate and schedule a hundred thousand batch jobs to make labels. After changing a few lines in a shell script to generate the labels without the "at" utility, label generation took a tenth of the time. The database was not changed.
Look at network traffic, OS stats, DB stats and I/O stats. The system is a whole, so compare the statistics together to get a view of what the system is doing over an interval of time. The statistics often reflect the boundaries between layers of software: I/O stats may show database activity as seen by the OS, and CPU context switches may show how other processes affect database processes. It is the whole system that must perform well, not just the database, so build a well-tuned system by taking a wide look at what is going on.
Keep in mind that the application is separate from the database, the database engine is separate from the OS it runs on, and that OS connects to a network. This layered architecture is how I think about the performance problems that tuning will fix. Many problems occur where the layers interface, or come down to how well the system is integrated.
Make a list of down times, batch job failures, backup failures, slow performance and so on in the network, OS, applications and the database. Look for patterns and relationships between the statistics and the failures. A system is a house of cards that needs a solid foundation: lack of backups, lack of OS resources, network problems, lack of security and poor configuration management may mask or compound any database performance problems. Tuning just the database is futile if the system crashes every hour. From the problem tracking list, find the easiest things to fix with the most payoff. Usually there are a couple of quick and effective things that can be done to get performance improvements.
Many times the quick and effective fixes are not database problems at all but a problem, or mix of problems, in other parts of the system or in configuration management. Examples I have seen: OS file systems that are too full, shared memory misconfiguration, and software integration problems between the OS, database and third-party applications, such as missing software patches or other dependencies; these can be seen as software layer interface problems. Lack of configuration management of software going into production systems is a problem many overlook: adding software to production systems without controls can degrade performance or stop systems altogether.
Again, check OS, network, application and user configurations, not just the database configuration. Look up parameters that you do not understand. I have seen old configuration files from previous releases of software that used out-of-date parameters or lacked the new parameters that make the software perform well. One system had new fast NIC cards set to run at an old, low network speed; the new fast settings had never been added to the configuration file.
Read the release notes that come with the database software for OS configuration parameters on the hardware used, and note the OS and application patches needed for the particular configuration. Then make sure the database and OS match the release notes of the applications that run on the system. This chain of dependencies can cripple performance if every link is not in place.
One basic performance improvement in databases such as PostgreSQL, Oracle, DB2 and Informix comes from the cost-based optimizers, which generate query plans by looking at statistics from the data, such as the number of rows in a table and the distributions of values. But the statistics used by the optimizers are usually not generated by default; many times I have seen no statistics jobs run in a database installation, and poor performance is the result.
Check that the database optimizer statistics are run as recommended. Oracle's "ANALYZE" (release 7) and "DBMS_STATS" (8 and later), Informix's "UPDATE STATISTICS" and PostgreSQL's "VACUUM ANALYZE" commands create the statistics the SQL optimizers use to choose fast access methods to the data. Statistics only have to be regenerated when the data changes enough to make a difference; statistics jobs are resource intensive, so run them only when needed. The object is to improve and tune database performance, not cripple it with excess statistics processes.
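As a rough sketch, the commands look like this on each product; the schema and table names are placeholders:

```sql
-- Oracle 8 and later (SQL*Plus): gather optimizer statistics for one table
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'APP', tabname => 'ORDERS');

-- Informix: refresh statistics for a table
UPDATE STATISTICS FOR TABLE orders;

-- PostgreSQL: reclaim space and collect planner statistics
VACUUM ANALYZE orders;
```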
Most RDBMS collect statistics from the RDBMS process memory to monitor the database while it is running. These statistics are exposed through functions presented as views and can be queried with SQL. In PostgreSQL, the monitoring statistics views have information about the SQL statements run, the number of times a table or index is scanned, and more; the same goes for Oracle, DB2, Informix and the rest. Use these monitoring statistics to find the SQL statements that are the real system problems. Typically plenty of SQL statements perform badly, but only a few are the real resource problems; find the few that cause the most trouble and put your effort into fixing those.
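For example, queries along these lines pull the heavy hitters out of the monitoring views; Oracle's V$SQL and PostgreSQL's pg_stat_user_tables are standard views, though the columns worth watching vary by release:

```sql
-- Oracle: statements doing the most disk I/O
SELECT sql_text, executions, disk_reads, buffer_gets
  FROM v$sql
 ORDER BY disk_reads DESC;

-- PostgreSQL: tables being scanned sequentially instead of by index
SELECT relname, seq_scan, idx_scan
  FROM pg_stat_user_tables
 ORDER BY seq_scan DESC;
```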
Start by identifying what SQL is run and why. Sometimes reports are run at the wrong times on an OLTP application system, or a batch SQL job can easily be rescheduled so it does not conflict with another application that is using lots of resources on a server.
Identify the SQL statements that are slow and schedule them to run at times that do not crush the rest of the system. Run "set explain on" (Informix) or "set autotrace on" and "set timing on" (Oracle) to see what the optimizer is doing with the SQL you are checking.
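A quick sketch of both, with a made-up query; the Oracle settings are SQL*Plus commands, and "set autotrace on" needs the PLUSTRACE role set up by the DBA:

```sql
-- Oracle (SQL*Plus): show the plan and timing for each statement run
SET AUTOTRACE ON
SET TIMING ON
SELECT COUNT(*) FROM orders WHERE order_date > SYSDATE - 7;

-- PostgreSQL: run the statement and report the actual plan and timings
EXPLAIN ANALYZE
SELECT COUNT(*) FROM orders WHERE order_date > now() - interval '7 days';
```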
Sometimes indexes are controlled by applications and cannot be administered easily, but usually there is no problem manipulating the indexing. The first thing is to check that indexes are set up for the slow SQL statements; many times the problem SQL statement uses no index at all because the application or the data distribution has changed but the indexing on the affected tables has not. I have seen tenfold improvements in performance from the addition of a good index or the addition of an attribute to an existing index.
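The fix itself is usually a one-liner; these statements use hypothetical table and column names:

```sql
-- Add an index matching a slow statement's WHERE clause ...
CREATE INDEX orders_cust_ix ON orders (customer_id);

-- ... or extend the indexing with an extra attribute the queries filter on
CREATE INDEX orders_cust_date_ix ON orders (customer_id, order_date);
```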
There are lots of quick fixes to check for in indexes. Does the index contain an attribute whose domain includes nulls? Then the index will not be as effective in comparisons; perhaps a default value can be used to get rid of the nulls. Attributes with a small domain, an uneven distribution of values, or nulls are not usually good choices for an index.
A simple Oracle query, sketched below, can check the values of index attributes to evaluate performance. Adding an attribute with a larger domain of values, dropping attributes that are not useful, or even dropping an index to reduce overhead are all options. Sometimes indexes need rebuilding if their tables have changed a lot over an interval of time.
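The original script is not reproduced here, but a reconstruction of the kind of check it performed looks like this, using the standard USER_IND_COLUMNS dictionary view and hypothetical table and column names:

```sql
-- Which attributes make up each index on a table, and in what order
SELECT index_name, column_name, column_position
  FROM user_ind_columns
 WHERE table_name = 'ORDERS';

-- For a candidate attribute: few distinct values or many nulls
-- relative to the row count mark a poor index attribute
SELECT COUNT(*)                 AS total_rows,
       COUNT(DISTINCT status)   AS distinct_values,
       COUNT(*) - COUNT(status) AS null_rows
  FROM orders;
```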
There are different types of indexes and index organizations that can address performance problems. Bitmap indexes can be used for attributes with low cardinality, especially if there is an uneven distribution across the attribute domain. Because updates are costly, they are mostly used in tables that are not updated but read only, such as a data warehouse application. I have had problems with the Oracle 7.3.4 optimizer recognizing a bitmap index without a hint, which makes it almost worthless without changing code in an application that may not be under the database administrator's control.
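A sketch of both the index and the hint, with example names:

```sql
-- Bitmap index on a low-cardinality attribute (data warehouse style)
CREATE BITMAP INDEX orders_status_bix ON orders (status);

-- Hint nudging an older optimizer toward the bitmap index
SELECT /*+ INDEX(o orders_status_bix) */ COUNT(*)
  FROM orders o
 WHERE status = 'BACKORDER';
```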
If you have multiple CPUs and many disks, some degree of parallelism can be easily configured, though the details for each database are too much to cover here. Check your manuals, environment variables and configuration files for both sort and SQL parallelism. I have seen many database installations where easy-to-use parallelism features noticeably improved performance after a couple of lines changed in a configuration file and a restart. Some parallelism features can create performance problems, though; evaluate them carefully, and check whether a particular parallel feature can be turned on for a particular time or set of transactions and turned off at other times.
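As one hedged example, Oracle lets a single session switch parallel query on around a batch job and back off afterward, leaving the rest of the system alone:

```sql
-- Force parallel query for this session while the batch SQL runs
ALTER SESSION FORCE PARALLEL QUERY PARALLEL 4;

-- ... run the big batch statements here ...

-- Turn it back off so daytime OLTP work is unaffected
ALTER SESSION DISABLE PARALLEL QUERY;
```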
Table partitioning is a method of storing a table in separate parts on disk to reduce the amount of data that must be accessed. Analysis of SQL that accesses large tables may reveal a table or index partitioning scheme that removes a bottleneck by shrinking access to a small part of the table, while the rest is partitioned into rarely accessed pieces that are basically dormant. An example is partitioning a table on a "date" attribute for queries that access data by date: if the table is partitioned by year or month and the usual query looks for recent dates, the rest of the table is never searched. I have seen response times improve tenfold for specific SQL statements using this technique. The more developed RDBMS products have several table and index partitioning methods. There is a cost to administering more partitions, so evaluate this technique carefully.
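A sketch of the date example in Oracle syntax, with placeholder table and partition names:

```sql
-- Range-partition by date so queries on recent dates touch one partition;
-- older partitions sit dormant on disk
CREATE TABLE shipments (
    ship_id   NUMBER,
    ship_date DATE
)
PARTITION BY RANGE (ship_date) (
    PARTITION p2003 VALUES LESS THAN (TO_DATE('2004-01-01', 'YYYY-MM-DD')),
    PARTITION p2004 VALUES LESS THAN (TO_DATE('2005-01-01', 'YYYY-MM-DD')),
    PARTITION pmax  VALUES LESS THAN (MAXVALUE)
);
```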
If you have a smaller set of disks, do the traditional database disk configuration. Separate the system catalogs from the various types of logs in different database partitions (Informix: logical and physical logs; Oracle: rollback, redo and archive logs; there are similar schemes for all the other RDBMS products). Separate very active tables from each other, and separate indexes from tables. Table and index partitioning over multiple disks helps parallelism and I/O performance; however, disk reorganization is a lot of work. Separate disk space for the database logs probably has the most payoff, followed by the most active tables, but this move should be researched carefully to make sure the return is worth the investment.
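In Oracle terms, separating indexes from tables is a small sketch like this; the file path and names are hypothetical:

```sql
-- Put index segments in a tablespace on a second disk, away from the data
CREATE TABLESPACE idx_ts DATAFILE '/disk2/oradata/idx01.dbf' SIZE 500M;
CREATE INDEX shipments_date_ix ON shipments (ship_date) TABLESPACE idx_ts;
```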
Help the developers stop future problems by getting involved in design and code reviews. Look for things like many connections made and terminated to the database over a network when one connection, or a connection pooling scheme, could be used; lack of database error trapping; and huge joins with gobs of I/O sent over the network to fat clients for additional processing. Show the developers how to analyze SQL for performance and how to avoid long transactions.
Systems change. Data changes, software is upgraded, new hardware and applications are added and old parts of the system are removed. A whole system approach to RDBMS tuning and administration helps you be aware of the changes, calculate the risks and anticipate problems.
An older version of this article, "Quick and Dirty RDBMS Tuning," is on O'Reilly's Onlamp.com site.