Skip to main content

skip to main content

developerWorks  >  Information Management  >

Allocating for High-Velocity Inserts on DB2 UDB for iSeries

developerWorks
Document options

Document options requiring JavaScript are not displayed


Learn and share!

Exchange know-how with your peers -- try our new Pass It Along beta app


Rate this page

Help us improve this content


Level: Introductory

Kent MilliganIBM Rochester

01 Mar 2002

Is your application heavy on inserts? Read these tips on how you can improve the performance of those applications in concurrent environments by preallocating storage. A sample SQL stored procedure is included.

Introduction

When you have lots of concurrent activity on your database, there are many bottlenecks that could slow the performance of the database engine. The most common bottlenecks that I see are:

  • Locking and serialization conflicts.
  • A "hot" disk drive; that is, when the most popular rows accessed by applications happen to all be physically stored on the same disk drive.

The purpose of this article, however, is to examine a less common source for bottlenecks: storage allocation for tables.



Back to top


A case for preallocating storage

One of the attributes that has made DB2® UDB for iSeries® easy to use and manage over the years is that DB2 automatically handles all of the low-level storage allocation for the DB2 objects. All you have to do is create the table, and let DB2 automatically handle allocating disk storage and spreading the table data evenly over the disk drives. This automatic storage allocation and management works fine for the majority of iSeries and AS/400e® customers. However, there is one environment in which performance can be improved by preallocating the storage for a DB2 table.

An example of an environment that can benefit from preallocated storage is when you have a table in which new rows are being inserted at high rates by multiples connections and jobs -- especially in batch processing environments. Consider preallocating storage for the rows in that table when you have a rough idea on what the maximum row count for the table will be. The reason for preallocating the table storage is so that DB2 can insert new rows into the table during heavy, concurrent loads without having to periodically allocate storage for new rows, which interruprts the insertion process. If many connections and jobs are allocating new space for a DB2 table at the same time, then the queue for storage allocation can quickly become a bottleneck.



Back to top


How to do it

After you've identified a table that has a high-velocity insert rate in concurrent environment, and after you determine approximately what the maximum row count will be, then you can use the CHGPF OS/400® system command to preallocate storage for the table. The good news is that there's a command for preallocating the disk storage; the bad news for SQL-based applications is that there is not an SQL interface available for allocating the table storage. However, Figure 1 shows an SQL stored procedure, which I call allocate_rows, that can be invoked from an SQL script after you've created the table to preallocate row storage for the newly created table or to allocate additional row storage for an existing table.



Back to top


The allocate_rows stored procedure

The allocate_rows stored procedure requires three input parameter names:

  • tablename, which is the name of the table for which storage is to be allocated.
  • libname, which is the schema or collection name.
  • rowcount, which is the number of rows for which storage is to be allocated.

The stored procedure uses these parameters to dynamically construct a CHGPF request for the specified tables. Here's an example of the command that is constructed:

CHGPF FILE(MYSCHEMA/MYTABLE) SIZE(50000 1000 3) ALLOCATE(*YES)

This command allocates storage for 50000 rows in the specified table, MYTABLE. The SIZE parameter contains 3 different numeric values:

  • The first numeric value (50000) is the number of rows initially allocated for the table.
  • The second parameter contains the number of rows that will automatically be added to the table if the initial row allocation is exceeded.
  • The third value is the number of times that additional rows will be allocated.

In the example shown above, storage for 50000 rows is allocated initially, and the table can support a maximum of 53000 rows (50000 + (3 x 1000)), with 1000 rows being allocated each time the row allocation limit is reached, up to 3 times. This is the main reason why you can only preallocate table storage effectively for those tables in which there is some idea of the maximum number of rows.

The values for the number of additional rows and the number of times that additional rows can be allocated are set to 1000 and 3 in the stored procedure shown in Figure 1, but you can change those values to meet your requirements.

The allocate_rows stored procedure uses a system-provided stored procedure, QCMDEXC, to execute the CHGPF system command from an SQL procedure. The QCMDEXC stored procedure requires two input parameters:

  • A character string that contains the system command.
  • The character length of that system command.

After the CHGPF command has been executed via the QCMDEXC stored procedure, then further action is required to actually activate the specified change. The stored procedure uses either the RGZPFM or CLRPFM system command on the specified table, depending on whether the table already contains rows. The CLRPFM command deletes all of the rows in the specified table, which is why the stored procedure first runs a query to check if there are any rows in the input table. If the table already contains data, then the RGZPFM command is used to reorganize the table to activate the change in allocation. The RGZPFM command can take a very long time to run for an existing table with a large number of rows, so use this stored procedure carefully with tables that already contain large amounts of data.


Figure 1. SQL procedure to preallocate storage
				
create procedure
  allocate_rows (tablename varchar(256), libname char(10), rowcnt integer )
language sql
begin
  DECLARE qualified_name VARCHAR(20);
  DECLARE current_count INTEGER;
  DECLARE count_stmt VARCHAR(64);
  DECLARE change_cmd VARCHAR(128);
  DECLARE activate_tblchg_cmd VARCHAR(64) ;
  DECLARE cmd_length NUMERIC(15,5);
  DECLARE c1 CURSOR FOR s1;

  SELECT   strip(system_table_schema) || '/' || strip(system_table_name)
                INTO qualified_name
      FROM qsys2.systables
 	 WHERE table_name=tablename  and table_schema=libname;

/* Execute the CHGPF command to change the table allocation */
SET change_cmd = 'CHGPF FILE('|| qualified_name  || ') ' ||
                    'SIZE(' || CHAR(rowcnt) || '1000 3) ALLOCATE(*YES)';
SET cmd_length=LENGTH(change_cmd);
Call QSYS.QCMDEXC (change_cmd,cmd_length);

/* Need to determine the number of rows in the table to activate
this new table allocation */
SET count_stmt = 'SELECT COUNT(*) FROM ' || libname || '.' || tablename ;
PREPARE S1 FROM count_stmt;
OPEN c1;
FETCH c1 INTO current_count;
CLOSE c1;

/* Use the current number of rows to determine the "activation"command */
IF current_count=0 THEN

    /* No rows in the table means a clear operation can be used to get the
    new ALLOCATE settings to take effect */

    SET activate_tblchg_cmd='CLRPFM ' || qualified_name;
ELSE
    /* If there are rows, they cannot be deleted so reorganize instead */
    SET activate_tblchg_cmd='RGZPFM ' || qualified_name;
END IF;
SET cmd_length = LENGTH (activate_tblchg_cmd);
/* Activate the new table allocation values by issuing a clear or reorganize */
Call QSYS.QCMDEXC (activate_tblchg_cmd,cmd_length);

END;
			



Back to top


Changing node size for indexes and keyed logical files

Another low-level change that can improve performance in high-concurrency environment is to change the node size in the underlying tree used by DB2 to implement indexes and keyed logical files. By default, SQL indexes are created with the larger node size that improves concurrency. However, there are many older keyed logical files with the smaller node size. Again, this change cannot be done with SQL; the change must be done with the CHGLF command. This same stored procedure design could be reused to build a stored procedure that issues the CHGLF command to change the node structure for an existing index. Here's an example of the command needed to update an index to use the larger node size that's better suited for concurrent environments.

CHGLF FILE(MYSCHEMA/MYINDEX) FRCRBDAP(*YES) ACCPTHSIZ(*MAX1TB)



Back to top


Summary

I've described the situation in which performance can be improved by preallocating storage, namely, high-velocity inserts in a concurrent environment. I've also shown you how you can create an SQL stored procedure to do this task when an OS/400 command interface cannot be used.



About the author

Photo: Kent Milligan

Kent Milligan is a DB2 UDB Technology Specialist in PartnerWorld for Developers, iSeries. Kent spent the first eight years of his IBM career as a member of the DB2 development group in Rochester. He speaks and writes regularly on various iSeries and AS/400e relational database topics. You can reach him at kmill@us.ibm.com.




Rate this page


Please take a moment to complete this form to help us better serve you.



 


 


Not
useful
Extremely
useful
 


Share this....

digg Digg this story del.icio.us del.icio.us Slashdot Slashdot it!



Back to top