Problem: The database has been growing very rapidly and I am trying to figure out which parts are growing fast.

Solution: This is not an easy question to answer. With the tools that are available, computing an upper bound on the space utilized by table data is possible. Getting an exact amount of space usage is not currently possible, and producing an estimate is not practical. The next paragraphs explain why.

When a record is stored in a table, it is stored together with references to a trail of all of the back versions of that record. The record is also run-length encoded (RLE) to save some space, so it is stored with "mild" compression applied to it. Depending on the transaction being used, one may be looking at the current record or at a back version of the record. Even in the very nice case where a backup and restore has just been done, so that only current versions of records exist, going through each record and totalling the space taken up by the compressed records takes time and expense (it is not an easy global calculation); while this is a judgement call, it is not likely that there would be a return on the investment of calculating this disk usage. Add UPDATEs and DELETEs, which produce back versions of records, plus READ_COMMITTED and REPEATABLE_READ transactions, and the space usage calculation becomes much more complex.

When a record is stored on a data page it may occupy only a portion of the page, or, if it is large, it may spill over onto a second or third page and occupy all or part of those pages. A record may also be placed on a page that contains data belonging to another record. If a record has back versions (or delta records), those versions may be stored on data pages that contain data from other records, and a back version can spill over onto other data pages as well. In addition, each data page has some management overhead that takes away from its storage capacity. A restore typically fills each data page only to about 80% capacity, leaving room on the page for UPDATEs and DELETEs and for delta records. One final caveat on how data pages are used concerns newly INSERTed data: new records are placed first into empty space on existing data pages, then unallocated pages are converted to data pages, and only after that are new data pages allocated.

As can be seen from the preceding paragraphs, the exact space used by a table is next to impossible to calculate. What is possible is to obtain an upper bound for the space that is occupied. To get this data, run the gstat command line utility, or run the Database Statistics process in the Server Manager utility; in Database Statistics, go to the View menu and choose Database Analysis. The output shows the number of data pages currently occupied by each table and the average fill percentage. When the average fill percentage starts to get relatively low compared with the number of data pages used, that can be taken as a sign that it is time to back up and restore the database. The same applies to indexes. It is also worth checking that the OIT (oldest interesting transaction) is moving forward; if it is not, tables may be proliferating back versions of records.
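As a concrete sketch (the database file name and the login shown here are placeholders; substitute your own), the same analysis can be produced from the command line with something like:

    gstat -a -user SYSDBA -password masterkey employee.gdb

The upper bound for the space occupied by a table is then the number of data pages reported for that table multiplied by the database page size. The page size, along with the transaction counters used to check that the OIT is advancing, can be read from the database header with gstat -h.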
Another tactic that can be used, if appropriate, is to restore with the GBAK command line executable using its "use_all_space" parameter, which causes the data pages to be restored at 100% fill capacity (where possible); an example command follows the excerpt below. Posted below is an excerpted portion of the Database Statistics output to look at; these are the sections that can be used to determine the upper bound of the data pages used by a table and the index pages used by an index:

EMPLOYEE (131)
    Primary pointer page: 270, Index root page: 271
    Data pages: 5, data page slots: 5, average fill: 71%
    Fill distribution:
         0 - 19% = 0
        20 - 39% = 0
        40 - 59% = 1
        60 - 79% = 4
        80 - 99% = 0

    Index NAMEX (3)
        Depth: 1, leaf buckets: 1, nodes: 42
        Average data length: 15.00, total dup: 0, max dup: 0
        Fill distribution:
             0 - 19% = 0
            20 - 39% = 0
            40 - 59% = 0
            60 - 79% = 0
            80 - 99% = 1

    Index RDB$FOREIGN8 (1)
        Depth: 1, leaf buckets: 1, nodes: 42
        Average data length: 0.00, total dup: 23, max dup: 4
        Fill distribution:
             0 - 19% = 0
            20 - 39% = 1
            40 - 59% = 0
            60 - 79% = 0
            80 - 99% = 0

    Index RDB$FOREIGN9 (2)
        Depth: 1, leaf buckets: 1, nodes: 42
        Average data length: 6.00, total dup: 15, max dup: 4
        Fill distribution:
             0 - 19% = 0
            20 - 39% = 0
            40 - 59% = 1
            60 - 79% = 0
            80 - 99% = 0

    Index RDB$PRIMARY7 (0)
        Depth: 1, leaf buckets: 1, nodes: 42
        Average data length: 1.00, total dup: 0, max dup: 0
        Fill distribution:
             0 - 19% = 0
            20 - 39% = 1
            40 - 59% = 0
            60 - 79% = 0
            80 - 99% = 0
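To illustrate both points (file names and the login are placeholders, and the switch spellings should be checked against the gbak documentation for your version), a backup followed by a restore that packs the data pages completely might look like:

    gbak -b -user SYSDBA -password masterkey employee.gdb employee.gbk
    gbak -c -use_all_space -user SYSDBA -password masterkey employee.gbk employee_new.gdb

Applying the upper-bound idea to the excerpt above: EMPLOYEE occupies 5 data pages, so with, for example, a 4096-byte page size (the actual page size is reported by gstat -h) the table data takes at most 5 x 4096 = 20480 bytes, and at an average fill of 71% the real usage is somewhat below that bound.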
Last Modified: 27-OCT-00