Level: Introductory David J. Kline, PartnerWorld for Developers Technical Support Specialist, IBM Gabor Wieser (gaborw@us.ibm.com), PartnerWorld for Developers Technical Support Specialist, IBM
19 Dec 2002 This article helps the new DB2 DBA understand the importance of table spaces and bufferpools, and explains how properly designing and tuning them can enhance database performance.
© 2002 International Business Machines Corporation. All rights reserved.
This article is written for IBM DB2 Universal DatabaseTM for Linux, UNIX®, and Windows ®.
Introduction
For the DBA who is just stepping into the world of DB2 or for the prospective DBA, the design and performance choices for a new database can be very confusing. In this article we will discuss a couple of areas where the DBA has important choices to make: table spaces and buffer pools. The design and tuning of your table spaces and buffer pools can have a profound impact on how your DB2 server performs, so we will focus our attention on those activities.
In our examples, we will be using DB2 version 8.1, Enterprise Server Edition. Most of the examples also apply to down-level versions. We will let you know if an example applies only to version 8.1.
In Section 1, we will start off by defining the types of table spaces and will explain how DB2 stores data in table spaces. We'll cover configuration options and take you through the process of creating and managing a table space. Next, we focus on buffer pools, covering what a buffer pool is and how to create and use it. In Section 2, we will bring these two areas together and discuss how buffer pools and table spaces should be organized to maximize performance.
Section 1: Definitions
Table Spaces
All data for a database is stored in a number of table spaces. You can think of a table space as being a child and a database as its parent, where the table space (child) cannot have more than one database (parent). Because there are different uses for table spaces, they are classified according to their usage and how they will be managed. There are five different table spaces by usage:
- Catalog table space
- There is only one catalog table space per database, and it is created when the CREATE DATABASE command is issued. Named SYSCATSPACE by DB2, the catalog table space holds the system catalog tables. This table space is always created when the database is created.
- Regular table spaces
- Regular table spaces hold table data and indexes. It can also hold long data such as Large Objects (LOBs) unless they are explicitly stored in long table spaces. A table and its indexes can be segregated into separate regular table spaces, if the table spaces are database managed space (DMS). We will define the differences between DMS and system managed space (SMS) later in this article. At least one regular table space must exist for each database. The default is named USERSPACE1 when the database is created.
- Long table spaces
- Long table spaces are used to store long or LOB table columns and must reside in DMS table spaces. They can also store structured type columns or index data. If no long table space is defined, then LOBs will be stored in regular table spaces. Long table spaces are optional and none will be created by default.
- System temporary table spaces
- System temporary table spaces are used to store internal temporary data required during SQL operations such as sorting, reorganizing tables, creating indexes, and joining tables. At least one must exist per database. The default created with the database is named TEMPSPACE1.
- User temporary table spaces
- User temporary table spaces store declared global
temporary tables. No user temporary table spaces exist when
a database is created. At least one user temporary table
space should be created to allow definition of declared
temporary tables. User temporary table spaces are optional
and none will be created by default.
Table space management
Table spaces can be managed two different ways:
- System managed space (SMS)
- SMS table spaces are managed by the operating system. Containers are defined as regular operating system files and they are accessed via operating system calls. This means that all the regular operating system functions will handle the following: I/O will be buffered by the operating system, space will be allocated according to the operating system conventions, and the table space is automatically extended when it is necessary. However, containers cannot be dropped from SMS table spaces, and adding new ones is restricted to partitioned databases. The three default table spaces explained in the previous section are SMS.
- Database managed space (DMS)
- DMS table spaces are managed by DB2. Containers can be
defined either as files (which will be fully allocated with
the size given when the table space is created) or devices.
DB2 will manage as much of the I/O as the allocation method
and the operating system will allow. Extending the
containers is possible by using the ALTER TABLESPACE
command. Unused portions of DMS containers can be also
released (starting with version 8).
Here is an example that shows how you can increase container sizes (supported with both version 7 and version 8):
ALTER TABLESPACE TS1 RESIZE (FILE '/conts/cont0' 2000, DEVICE '/dev/rcont1' 2000, FILE 'cont2' 2000) |
Note that resizing an original container to a smaller size is only supported with version 8.
How to create and view your table spaces
When you create a database, three table spaces will be created (SYSCATSPACE, TEMPSPACE1, and USERSPACE1). Using the DB2 Command Window or the UNIX command line, create a database called testdb, connect to it, and list the table spaces:
CREATE DATABASE testdb CONNECT TO testdb LIST TABLESPACES |
Listing 1 below shows the output from the LIST TABLESPACES command.
Listing 1. Output from LIST TABLESPACES command
Tablespaces for Current Database
Tablespace ID = 0 Name = SYSCATSPACE Type = System
managed space Contents = Any data State = 0x0000
Detailed explanation: Normal
Tablespace ID = 1 Name = TEMPSPACE1 Type = System
managed space Contents = System Temporary data State =
0x0000 Detailed explanation: Normal
Tablespace ID = 2 Name = USERSPACE1 Type = System
managed space Contents = Any data State = 0x0000
Detailed explanation: Normal
|
The three table spaces shown above are automatically created by the CREATE DATABASE command. The user can override the default table space creation by including table space specifications in the command, but a catalog table space and at least one regular and one system temporary table space must be created at database creation time. More table spaces of all types (except catalog table space) can be created either with the CREATE DATABASE command, or later using the CREATE TABLESPACE command.
Containers
Every table space has one or more containers. Again, you might think of a container as being a child and a table space as its parent. Each container can only belong to a single table space but a table space can have many containers. Containers can be added to or dropped from a DMS table space, and their sizes can be modified. Containers can only be added to SMS table spaces on partitioned databases in a partition, which does not yet have a container allocated for the table space. When new containers are added, an automatic rebalancing will start to distribute the data across all containers. Rebalancing will not prevent concurrent access to the database.
Table space settings
There are a number of settings that can be specified for table spaces, either when they are created, or later with an ALTER TABLESPACE statement.
- Page size
- Defines the size of pages used for the table space. Sizes supported are 4K, 8K, 16K and 32K. The page size limits the row length and column count of tables that can be placed in the table space according to the following table:
Table 1. Implications of page size
|
Page Size
|
Row Size Limit
|
Column Count Limit
|
Maximum capacity
| | 4 KB | 4 005 | 500 | 64 GB | | 8 KB | 8 101 | 1 012 | 128 GB | | 16 KB | 16 293 | 1 012 | 256 GB | | 32 KB | 32 677 | 1 012 | 512 GB |
Table spaces are limited to 16384 pages, so choosing a larger page size will increase the capacity of the table space.
- Extent size
- Specifies the number of pages that will be written to a container before skipping to the next container. The database manager cycles repeatedly through the containers as data is stored. This parameter has effect only when there are multiple containers for the table space.
- Prefetch size
- Specifies the number of pages that will be read from the table space when data prefetching is being performed. Prefetching reads in data needed by a query prior to its being referenced by the query so that the query need not wait for I/O to be performed. Prefetching is selected by the database manager when it determines that sequential I/O is appropriate and that prefetching may help to improve performance.
- Overhead and transfer rate
- These values are used to determine the cost of I/O
during query optimization. Both values are measured in
milliseconds and they should be the average for all
containers. The overhead is the time associated with I/O
controller activity, disk seek time and rotational latency.
The transfer rate is the amount of time necessary to read
one page into memory. The default values are 24.1 and 0.9,
respectively. These values can be calculated based on
hardware specifications.
Example of a CREATE TABLESPACE statement
The following statement will create a regular table space. All of the settings discussed are shown for illustration.
CREATE TABLESPACE USERSPACE3
PAGESIZE 8K
MANAGED BY SYSTEM
USING ('d:\usp3_cont1', 'e:\usp3_cont2', 'f:\usp3_cont3')
EXTENTSIZE 64
PREFETCHSIZE 32
BUFFERPOOL BP3
OVERHEAD 24.1
TRANSFERRATE 0.9 |
How to view your table space attributes and containers
Specifying the SHOW DETAIL option of the LIST TABLESPACES command will show additional information:
LIST TABLESPACES SHOW DETAIL |
Listing 2
shows the output for the USERSPACE1 table space. By default, the three table spaces created at database creation time will be listed.
Listing 2. Output from LlST TABLESPACES SHOW DETAIL command
Tablespaces for Current Database
Tablespace ID = 2 Name = USERSPACE1 Type = System
managed space Contents = Any data State = 0x0000
Detailed explanation: Normal Total pages = 336 Useable
pages = 336 Used pages = 336 Free pages = Not
applicable High water mark (pages) = Not applicable
Page size (bytes) = 4096 Extent size (pages) = 32
Prefetch size (pages) = 16 Number of containers = 1
|
To list the containers we need to use the Tablespace id from the output above:
LIST TABLESPACE CONTAINERS FOR 2 |
Listing 3. Output from LIST TABLESPACE CONTAINERS command
Tablespace Containers for Tablespace 2
Container ID = 0 Name =
C:\DB2\NODE0000\SQL00004\SQLT0002.0 Type = Path
|
The command will list all containers for the specified table space. The path, shown above, points to where the container physically resides.
Buffer Pools
A buffer pool is associated with a single database and can be used by more than one table space. When considering a buffer pool for one or more table spaces, you must ensure that the table space page size and the buffer pool page size are the same for all table spaces that the buffer pool services. A table space can only use one buffer pool.
When the database is created, a default buffer pool named IBMDEFAULTBP is created which is shared by all table spaces. More buffer pools can be added by using the CREATE BUFFERPOOL statement. The buffer pool size defaults to the size specified by the BUFFPAGE database configuration parameter but can be overridden by specifying the SIZE keyword in the CREATE BUFFERPOOL command. Adequate buffer pool size is essential to good database performance since it will reduce disk I/O, the most time consuming operation. Large buffer pools will also have an effect on query optimization, since more of the work can be done in memory.
- Block-based buffer pools
- Version 8 allows you to set aside a portion of the buffer pool (up to 98%) for block-based prefetching. Block-based I/O will improve the efficiency of prefetching by reading a block into a contiguous area of memory instead of scatter loading it into separate pages. The size of the blocks must be uniform per buffer pool and is controlled by the BLOCKSIZE parameter. The value is the size of block in pages, from 2 to 256, the default being 32.
- Extended storage
- DB2 does not use extended storage for buffers. However, extended storage can be used to cache memory pages, making it faster to move the pages out from memory.
Example of CREATE BUFFERPOOL statement
Here is an example of the CREATE BUFFERPOOL statement:
CREATE BUFFERPOOL BP3 SIZE 2000 PAGESIZE 8K |
This buffer pool is assigned to USERSPACE3 on the CREATE TABLESPACE example above and is created before creating the table space. Note that the page sizes of 8K for the buffer pool and table space are the same. If you create the table space after creating the buffer pool, you can leave out the BUFFER POOL BP3 syntax in the CREATE TABLESPACE statement. Instead, you can use the ALTER TABLESPACE command to add the buffer pool to the existing table space:
ALTER TABLESPACE USERSPACE3 BUFFERPOOL BP3 |
How to view your buffer pool attributes
You may list buffer pool information by querying the SYSCAT.BUFFERPOOLS system view:
SELECT * FROM SYSCAT.BUFFERPOOLS
BPNAME BUFFERPOOLID NGNAME NPAGES PAGESIZE ES
------------------ ------------ ------------------
----------- ----------- -- IBMDEFAULTBP 1 - 250 4096 N
1 record(s) selected.
|
To find out which buffer pool is assigned to table spaces, run the following query:
SELECT TBSPACE, BUFFERPOOLID FROM
SYSCAT.TABLESPACES
TBSPACE BUFFERPOOLID ------------------ ------------
SYSCATSPACE 1 TEMPSPACE1 1 USERSPACE1 1
3 record(s) selected.
|
The BUFFERPOOLID can be found in the previous query, enabling you to see which buffer pool is associated with each table space.
Visual diagram of how a database holds table spaces
Now that we have described what a table space and buffer pool is and how to create them, let's take a look at an example of how they are visually organized within a database.
Figure 1. Table spaces and buffer pools
This database has five table spaces: a catalog, two regular, a long, and a system temporary table space. No user temporary table space was created. There are eight containers.
In this scenario buffer pools might be assigned as follows:
BP1 (4K) to SYSCATSPACE and USERSPACE2
BP2 (8K) to USERSPACE1
BP3 (32K) to LARGESPACE and SYSTEMP1
Section 2: Performance implications
In general, when designing table space and container placement on physical devices, the goal is to maximize I/O parallelism and buffer utilization. To achieve that goal, you should have a thorough understanding of the database design and applications. Only then can you determine such issues as whether or not segregating two tables to different devices will lead to parallel I/O, or whether or not a table should be created in a separate table space, so it can be fully buffered.
Designing the physical layout of a new database should start with designing the table space organization:
- The first step is determining the constraints given by the table designs. These might result in having to use more than one regular table space.
- The second step is to consider if having the tables in table spaces with different settings is likely to significantly increase performance.
- Once a tentative table space design is made, then buffer pool utilization will have to be considered. This might lead to some changes to the previous table space design.
- Finally, the table spaces must have containers allocated to them.
This process is iterative, and the design should be verified with stress testing and benchmarking. It is obvious that arriving at the optimum design can be quite an intensive effort, and can only be justified if the database performance must be the best possible. As a rule:
- Start out with the simplest feasible design.
- Add complexity only when there is a sufficient performance justification for it based on testing.
Often a slight degradation of performance is well worth the reduced complexity of administering and maintaining a simpler database design. DB2 has a very sophisticated resource management logic, and usually produces very good performance without elaborate design.
Table space organization
The catalog table space and system temporary table spaces should usually be allocated as SMS. There is no reason to have more than one temporary table space of the same page size, and usually one with the largest page size will be sufficient.
The salient question is whether to split up the user data into multiple table spaces or not. One consideration is the utilization of pages. Rows cannot be split between pages, so tables with long rows require the appropriate page size. However, there cannot be more than 255 rows on a page, so tables with short rows will not utilize the whole page. For example a table with a row length of 12 bytes placed in a table space with 32K page size will utilize only about 10% of each page (i.e. ((255 rows * 12 bytes) + 91 bytes of overhead) / 32k page size = ~10%).
This is only a consideration if the table is large, and thus the wasted space is significant. It will also make I/O and buffering less efficient, since the actual useful content of each page will be small. If a table can be placed into a smaller page size table space as well as fully utilizing a larger page size, then the most frequent method of access will determine which one is better. If typically a great number of rows are accessed sequentially (maybe the table is clustered), then the larger page size will be more efficient. If rows are accessed randomly, then the smaller page size will allow DB2 to make better use of the buffer, since more pages will fit into the same storage area.
Once the tables have been grouped by page size, then access frequency and type will determine if further grouping the data into separate table spaces will make sense. Each table, depending on how it is accessed most frequently, will have a most efficient set of table space settings: PAGESIZE, EXTENTSIZE, and PREFETCHSIZE. PAGESIZE was addressed above. EXTENTSIZE is the number of pages of data that will be written to a container before writing to the next container (if multiple containers exist in the table space).
PREFETCHSIZE specifies the number of pages that will be read from the table space when data prefetching is being performed. Prefetching is used when the Database Manager determines that sequential I/O is appropriate and that prefetching may help to improve performance (typically large table scans). It is a good practice to explicitly set the PREFETCHSIZE value as a multiple of the EXTENTSIZE value for your table space and the number of table space containers. For example, if the EXTENTSIZE is 32 and there are four containers, then good PREFETCHSIZEs would be 128, 256, etc. If one or more heavily used tables require a different set of these parameters from the values which are the best for the rest of the table space performance, then putting these tables into a separate table space might improve overall performance.
If prefetching is an important factor in a table space, then consider setting aside part of the buffer for block-based I/O. The block size should be equal to the PREFETCHSIZE.
Buffer pool utilization
The most important reason to use more than one user table space is to manage buffer utilization. A table space can only be associated with one buffer pool, though one buffer pool can be used for more than one table space.
The goal of buffer pool tuning is to help DB2 make the best possible use of the memory available for buffers. The overall buffer size has a great effect on DB2 performance, since a large number of pages can significantly reduce I/O, which is the most time consuming operation. However, if the total buffer size is too large, and there is not enough storage to allocate them, then a minimum buffer pool for each page size will be allocated, and performance will be sharply reduced. To calculate the maximum buffer size, all other storage utilization must be considered by DB2 as well as the operating system and any other applications. Once the total available size is determined, this area can be divided into different buffer pools to improve utilization. If there are table spaces with different page sizes, then there must be at least one buffer pool per page size.
Having more than one buffer pool can preserve data in the buffers. For example, let's suppose that a database has many very frequently used small tables, which would normally be in the buffer in their entirety and thus would be accessible very fast. Now let's suppose that there is a query which runs against a very large table, which uses the same buffer pool and involves reading more pages than the total buffer size. When this query runs, the pages from the small, very frequently used tables will be lost, making it necessary to re-read them when they are needed again.
If the small tables have their own buffer pool, thereby making it necessary for them to have their own table space, their pages cannot be overwritten by the large query. This will likely lead to a better overall system performance, albeit at the price of some small negative effect on the large query. Very often tuning is a trade-off between different functions of a system to achieve an overall performance gain. It is essential to prioritize functions and bear total throughput and usage in mind, while making adjustments to the performance of a system.
A new feature introduced with v8 is the ability to change buffer pool sizes without shutting down the database. The ALTER BUFFERPOOL statement with the IMMEDIATE option will take effect right away, except when there is not enough reserved space in the database-shared memory to allocate new space. This feature can be used to tune database performance according to periodical changes in use, for example switching from daytime interactive use to nighttime batch work.
Physical storage organization
Once tables are distributed among table spaces, their physical storage will have to be decided. A table space can be stored in multiple containers and can be either SMS or DMS. SMS is easier to administer and might be a good choice for table spaces containing many small, diverse tables (for example the catalog table space), especially if these tables contain LOBs. To reduce the overhead of extending the SMS containers one page at a time, the db2empfa
command should be run. This will set the value of the database configuration parameter MULTIPAGE_ALLOC to YES.
DMS usually has a better performance and provides the flexibility of storing indexes and LOB data separately. Multiple containers for a table space should typically be placed on separate physical volumes. This can improve the parallelism of some I/Os. When there are multiple user table spaces and multiple devices, you should consider the application logic, so workload will be distributed as evenly as possible among these devices.
RAID devices have their own special considerations. EXTENTSIZE should be equal to, or a multiple of the RAID stripe size. PREFETCHSIZE should be the RAID stripe size multiplied by the number of RAID parallel devices (or a multiple of this product), and a multiple of the EXTENTSIZE. DB2 comes with its own registry variables allowing you to enhance your specific environment. By performing the following command, I/O parallelism will be enabled within a single container:
Another registry variable, DB2_STRIPED_CONTAINERS=ON, will change the container tag size from one page to a full extent thus making it possible to line up the table space extents with the RAID stripes.
As in other areas of performance evaluation, the only sure way to know if a change has a beneficial effect is to conduct benchmarks. It is somewhat more complicated to perform in case of physical organization changes, because a comparatively large amount of effort is necessary to change table spaces. The best practical way is to reduce the number of cases during the design phase, so fewer cases need to be benchmarked later. The only occasion when it is worth spending the time and energy to rigorously benchmark competing designs is when performance is of very high importance, and there is a likelihood of significant performance difference between differing designs. Emphasis should be placed on buffer pools, making sure that they are not allocated in virtual memory and are utilized in the most efficient manner.
Considerations for moving databases
Tuning parameters and physical organization of the database should always be re-evaluated before moving it to a different system, even when these systems are the same kind of platform. In a real life situation, a DBA copied a well-tuned database from a Windows server with one GB of storage to a laptop with 256 MB storage. Connection, which was subsecond on the server, took 45 minutes. The problem was resolved by reducing the buffer pool size and other memory parameters.
The question becomes even more difficult if the platforms are different. Even between UNIX and Windows, what is optimal on one system might not be on the other. If the database copy is intended for production, then the tuning process should be repeated. If the database has to be moved to a zSeriesTM, some of this discussion does not apply, and the appropriate manuals and Redbooks should be consulted. On an iSeries system, physical setup and tuning is done altogether outside of the database environment, and the iSeriesTM system management manuals should be consulted.
Conclusion
We covered quite a bit of material in this article and it is by no means everything you should know about database design and performance. We focused on a couple of the bigger issues of database design without getting into details of query optimization and application considerations. Designing your database is first and foremost since everything else will be layered on top, so your initial planning should be comprehensive. For your convenience, we have provided other online references below so you can continue your education on this topic.
Resources
SQL Reference:
Other Table Space and Buffer Pool References::
Articles from the DB2 Developer Domain:
If you would like to learn more about PartnerWorld for Developers, please visit
http://www.developer.ibm.com/.
About the authors  | |  | David Kline works as a DB2 Technical Support representative for PartnerWorld for Developers. Along with 10 other team members, David helps Independent Software Vendors (ISVs) solve a wide range of development and administration issues. David has DB2 certifications in both application development and administration. He focuses most of his time helping ISVs with DBA (Database Administration) related problems. If you would like to learn more about PartnerWorld for Developers, please visit http://www.developer.ibm.com/. |
 | |  | Gabor Wieser has been in the information technology business for nearly thirty years. Last year he joined the PartnerWorld for Developers organization to provide technical services and support for DB2 developers. Gabor has worked as a mainframe system programmer and later as an application designer. In the last ten years he has designed, maintained and tuned databases on various platforms. Currently he is supporting developers using DB2 UDB on Windows/UNIX and DB2/400 on the iSeries. His main areas of expertise are database administration, extenders and system interconnectivity. Gabor can be reached at gaborw@us.ibm.com
. |
Rate this page
|