An IMS DB refresher for interview preparation. This tutorial covers the key points of IMS DB. Many interviewers ask these questions, or variations of them, when selecting IMS DB developers.
IMS DB Refresher
You can read my other article on Hierarchical database Vs RDBMS.
- IMS DB is a hierarchical database architecture widely used on IBM mainframes.
- Data is arranged logically in a top-down format. Data is grouped in records, which are subdivided into a series of segments.
- The structure of the database is designed to reflect logical dependencies.
- Certain data is dependent on the existence of certain other data.
IMS database organization
The nine types of databases supported by IMS DB can be grouped by their IMS access method.
Hierarchic Sequential Databases
The earliest IMS database organization types were based on sequential storage and access of database segments.
The root and dependent segments of a record are related by physical adjacency. Access to dependent segments is always sequential. Deleted dependent segments are not physically removed but are marked as deleted. Hierarchic sequential databases can be stored on tape or DASD.
Hierarchic sequentially accessed databases include:
- HSAM – In a hierarchic sequential access method (HSAM) database, the segments in each record are stored physically adjacent. Records are loaded sequentially with root segments in ascending key sequence. Dependent segments are stored in hierarchic sequence.
- The record format is fixed-length and unblocked. An HSAM database is updated by rewriting the entire database. Although HSAM databases can be stored on DASD or tape, HSAM is basically a tape-based format.
- IMS identifies HSAM segments by creating a two-byte prefix consisting of a segment code and a delete byte at the beginning of each segment. HSAM segments are accessed through two operating system access methods:
Basic sequential access method (BSAM)
Basic sequential access method (BSAM) is an access method for reading and writing data sets sequentially. BSAM is basic, as its name says: it transfers whole physical blocks and leaves the deblocking of reads and the blocking of writes to the program. Buffering is optional.
Queued sequential access method (QSAM)
QSAM is always used as the access method when the system is processing online.
SHSAM
A Simple HSAM (SHSAM) database contains only one type of segment: a fixed-length root segment.
HISAM
- Like HSAM, HISAM databases store segments within each record in physically adjacent sequential order. Unlike HSAM, each HISAM record is indexed, allowing direct access to each record. HISAM databases also provide a method for sequential access when required. HISAM databases are stored on DASD.
- A HISAM database is stored in a combination of two data sets. The database index and all segments in a database record that fit into one logical record are stored in a primary data set that is a VSAM KSDS. Remaining segments are stored in the overflow data set, which is a VSAM ESDS. The index points to the CI containing the root segment, and the logical record in the KSDS points to the logical record in the ESDS, if necessary.
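The split between the primary and overflow data sets can be pictured with a small Python sketch. It is only a toy model under stated assumptions, not how IMS or VSAM is implemented: the names `ksds`, `esds`, and `read_record` are hypothetical, a dictionary stands in for the indexed KSDS, a list stands in for the ESDS, and the chaining from one overflow record to the next is assumed for illustration.

```python
# Toy model of a HISAM database: a keyed primary data set plus an overflow
# data set. All names are illustrative assumptions, not IMS or VSAM APIs.

# Primary data set (stands in for the VSAM KSDS): root key -> logical record.
# Each logical record holds the segments that fit, plus an optional pointer
# (here a list position) to a logical record in the overflow data set.
ksds = {
    "E001": {"segments": ["EMPLOC E001", "EMPEDU 01"], "overflow": 0},
    "E002": {"segments": ["EMPLOC E002"], "overflow": None},
}

# Overflow data set (stands in for the VSAM ESDS): entry-sequenced records.
# Assumption: an overflow record may point to a further overflow record.
esds = [
    {"segments": ["EMPEDU 02", "EMPEDU 03"], "overflow": 1},
    {"segments": ["EMPEDU 04"], "overflow": None},
]


def read_record(root_key):
    """Return all segments of one database record in stored order."""
    logical = ksds[root_key]            # direct access through the index
    segments = list(logical["segments"])
    pointer = logical["overflow"]
    while pointer is not None:          # continue into the overflow data set
        logical = esds[pointer]
        segments.extend(logical["segments"])
        pointer = logical["overflow"]
    return segments


print(read_record("E001"))
print(read_record("E002"))
```

A record that fits entirely in the primary data set, such as `E002` here, never touches the overflow data set.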
SHISAM
A Simple HISAM (SHISAM) database contains only a root segment, and its segment has no prefix portion. SHISAM databases can use only VSAM as their access method. The data must be stored in a KSDS.
GSAM
Generalized sequential access method (GSAM) databases are designed to be compatible with MVS data sets.
They are used primarily when converting from an existing MVS-based application to IMS because they allow access to both during the conversion process.
To be compatible with MVS data sets, GSAM databases have no hierarchy, database records, segments, or keys. GSAM databases can be based on the VSAM or QSAM/BSAM MVS access methods.
Hierarchic Direct Databases
- Pointers are used to relate segments.
- Deleted segments are physically removed.
- VSAM ESDS or OSAM data sets are used for storage.
- HD databases are stored on DASD.
- HD databases are of a more complex organization than sequentially organized databases.
HDAM
- HDAM databases are typically used when fast access is needed to the root segment of the database record, usually by direct access. In a hierarchic direct access method (HDAM) database, the root segments of records are randomized to a storage location by an algorithm that converts a root’s key into a storage location.
- No index or sequential ordering of records or segments is involved. The randomizing module reads the root’s key and, through an arithmetic technique, determines the storage address of the root segment. The storage locations to which the roots are randomized are called anchor points or root anchor points (RAPs).
- The randomizing algorithm usually attempts to achieve a random distribution of records across the data set. Theoretically, randomizing the location of records minimizes the number of accesses required to retrieve a root segment.
- The randomizing technique results in extremely fast retrieval of data, but it usually does not provide for sequential retrieval of records. This can be achieved in HDAM databases through the use of secondary indexes or by using a physical-key-sequencing randomizer module.
- The advantage of HDAM is that it does not require reading an index to access the database. The randomizing module provides fast access to root segments and to the paths of dependent segments. It uses only the paths of the hierarchy needed to reach the segment being accessed, further increasing access speed. The disadvantage is that HDAM databases cannot be processed in key sequence unless the randomizing module stores root segments in physical key sequence.
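The idea behind randomizing can be sketched in a few lines of Python. This is a simplified illustration, not an IMS randomizing module: `zlib.crc32` is swapped in as the arithmetic technique, and `NUM_BLOCKS`, `RAPS_PER_BLOCK`, `randomize`, `insert_root`, and `get_root` are hypothetical names and values chosen for the example.

```python
import zlib

# Assumed sizes for the illustration; a real database defines its own.
NUM_BLOCKS = 4        # blocks available for root segments
RAPS_PER_BLOCK = 2    # root anchor points (RAPs) in each block


def randomize(root_key):
    """Convert a root key into a (block, RAP) storage location."""
    digest = zlib.crc32(root_key.encode("ascii"))
    slot = digest % (NUM_BLOCKS * RAPS_PER_BLOCK)
    return divmod(slot, RAPS_PER_BLOCK)   # (block number, RAP number)


# Each RAP anchors the roots whose keys randomize to that location.
storage = {}


def insert_root(root_key, data):
    """Store a root under the anchor point its key randomizes to."""
    storage.setdefault(randomize(root_key), []).append((root_key, data))


def get_root(root_key):
    """Go straight to the anchor point; no index is read."""
    for key, data in storage.get(randomize(root_key), []):
        if key == root_key:
            return data
    return None


for key in ("E003", "E001", "E002"):
    insert_root(key, f"root segment {key}")

print(get_root("E001"))
print(sorted(storage))   # ordered by location, not by root key
```

Because locations come from the hash rather than from the key order, walking `storage` by location does not visit roots in key sequence, which mirrors the disadvantage described above.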
HIDAM
- Unlike HDAM, HIDAM databases use an index to locate root segments. HIDAM databases are typically used to access database records randomly and sequentially and also access segments randomly within a record.
- The index and the database are stored in separate data sets. The index is stored as a single VSAM KSDS. The database is stored as a VSAM ESDS or OSAM data set. The index stores the value of the key of each root segment, with a four-byte pointer that contains the address of the root segment.
- The root segment locations in the index are stored in sequential order, allowing HIDAM databases to be processed directly or sequentially. A disadvantage of HIDAM databases is that the additional step required to scan an index makes access slower than with HDAM databases.
- Specifying PTR=TB or PTR=HB for root segments in HIDAM databases: when accessing a record by root key, IMS searches for the key in the index and uses the pointer to go directly to the record.
- If the PTR=TB or PTR=HB (twin backward pointer or hierarchic backward pointer) parameter is defined for the root, the root segments are chained together in ascending order. Sequential processing is done by following this pointer chain.
- In HIDAM, RAPs are generated only if the PTR=T or PTR=H (twin pointer or hierarchic pointer) parameter is specified for the root. When either of these pointer parameters is defined, IMS puts one RAP at the beginning of the CI or block.
- Root segments within the CI or block are chained by pointers from the most recently inserted back to the first root on the RAP. The result is that the pointers from one root to the next cannot be used to process roots sequentially. Sequential processing must be performed by using key values, which requires the use of the index and increases access time. For this reason, PTR=TB or PTR=HB should be specified for root segments in HIDAM databases.
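The difference between keyed access through the index and sequential processing along the root chain can be sketched as follows. This is a minimal model, assuming roots chained in ascending key order as with PTR=TB. The class `HidamDatabase`, its methods, and the integer addresses are hypothetical, and a sorted Python list searched with `bisect` stands in for the KSDS index.

```python
import bisect


class HidamDatabase:
    """Toy HIDAM model: a sorted index plus roots chained in key order."""

    def __init__(self):
        self.index_keys = []    # root keys in ascending order (the index)
        self.index_addrs = []   # address of each root, parallel to index_keys
        self.storage = {}       # address -> root with forward/backward pointers
        self.next_addr = 0      # next free address in the database data set

    def insert(self, key, data):
        pos = bisect.bisect_left(self.index_keys, key)
        if pos < len(self.index_keys) and self.index_keys[pos] == key:
            raise KeyError(f"duplicate root key {key!r}")
        addr = self.next_addr
        self.next_addr += 1
        # Neighbours in key sequence, found through the index.
        prev_addr = self.index_addrs[pos - 1] if pos > 0 else None
        next_addr = self.index_addrs[pos] if pos < len(self.index_keys) else None
        self.storage[addr] = {"key": key, "data": data,
                              "prev": prev_addr, "next": next_addr}
        if prev_addr is not None:
            self.storage[prev_addr]["next"] = addr
        if next_addr is not None:
            self.storage[next_addr]["prev"] = addr
        self.index_keys.insert(pos, key)
        self.index_addrs.insert(pos, addr)

    def get(self, key):
        """Keyed access: search the index, then follow its pointer."""
        pos = bisect.bisect_left(self.index_keys, key)
        if pos < len(self.index_keys) and self.index_keys[pos] == key:
            return self.storage[self.index_addrs[pos]]["data"]
        return None

    def scan(self):
        """Sequential processing: find the first root, then follow the chain."""
        addr = self.index_addrs[0] if self.index_addrs else None
        while addr is not None:
            root = self.storage[addr]
            yield root["key"], root["data"]
            addr = root["next"]


db = HidamDatabase()
for key in ("C", "A", "B"):          # inserted out of key order
    db.insert(key, f"root {key}")
print(db.get("B"))
print(list(db.scan()))
```

In this sketch `scan` consults the index once to find the first root and then relies only on the chain, which is the saving that the backward-pointer options are meant to provide for sequential processing.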
PHDAM databases
PHDAM databases are partitioned HDAM databases. Each PHDAM database is divided into a maximum of 1001 partitions, which can be treated as separate databases. PHDAM is one of the High Availability Large Database (HALDB) types.
Fast Path Databases
Fast Path databases provide fast access with limited functionality. Two types of databases can be used with the Fast Path feature of IMS. They are data entry databases (DEDBs) and main storage databases (MSDBs).
DEDB
- DEDBs are similar in structure to an HDAM database, but with some important differences. DEDBs are stored in special VSAM data sets called areas. The unique storage attributes of areas are a key element of the effectiveness of DEDBs in improving performance.
- While other database types allow records to span data sets, a DEDB always stores all of the segments that make up a record in a single area. The result is that an area can be treated as a self-contained unit.
- In the same manner, each area is independent of other areas. An area can be taken offline, for example, while a reorganization is performed on it. If an area fails, it can be taken offline without affecting the other areas.
- Areas of the same DEDB can be allocated on different volumes or volume types. Each area can have its own space management parameters. A randomizing routine chooses each record location, avoiding buildup on one device. These capabilities allow greater I/O efficiency and increase the speed of access to the data.
- An important advantage of DEDB areas is the flexibility they provide in storing and accessing self-contained portions of a database.
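The independence of areas can be illustrated with a short Python sketch. It is a conceptual model only: the classes `Area` and `Dedb`, the `online` flag, and the use of `zlib.crc32` as the randomizing routine are assumptions made for the example, not IMS interfaces.

```python
import zlib


class Area:
    """One self-contained area of a DEDB (toy model)."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.records = {}   # root key -> all segments of that record


class Dedb:
    def __init__(self, area_names):
        self.areas = [Area(name) for name in area_names]

    def area_for(self, root_key):
        """Assumed randomizing routine: pick exactly one area per record."""
        digest = zlib.crc32(root_key.encode("ascii"))
        return self.areas[digest % len(self.areas)]

    def put(self, root_key, segments):
        area = self.area_for(root_key)
        if not area.online:
            raise RuntimeError(f"area {area.name} is offline")
        area.records[root_key] = list(segments)   # whole record in one area

    def get(self, root_key):
        area = self.area_for(root_key)
        if not area.online:
            raise RuntimeError(f"area {area.name} is offline")
        return area.records.get(root_key)


db = Dedb(["AREA1", "AREA2", "AREA3"])
for n in range(6):
    db.put(f"K{n:03d}", [f"root K{n:03d}"])

# Take one area offline, for example to reorganize it.
db.areas[0].online = False
for n in range(6):
    key = f"K{n:03d}"
    try:
        print(key, db.get(key))
    except RuntimeError as err:
        print(key, "unavailable:", err)
```

Only the records that randomize to the offline area become unavailable. Records in the other areas are read as usual.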
MSDB
- Main storage databases (MSDBs) are so named because the entire database is loaded into main storage when processing begins. This makes them extremely fast, because database segments do not have to be retrieved from DASD. Most IS shops reserve MSDBs for a site’s most frequently accessed data, particularly data that requires a high transaction rate. The fact that MSDBs require memory storage limits their size.
Segments
- A segment is the smallest structure of the database in the sense that IMS cannot retrieve data in an amount less than a segment. Segments can be broken down into smaller increments called fields, which can be addressed individually by application programs.
In IMS, segments are defined by the order in which they occur and by their relationship with other segments:
- Root segment – The first, or highest segment in the record. There can be only one root segment for each record. There can be many records in a database.
- Dependent segment – All segments in a database record except the root segment.
- Parent segment – A segment that has one or more dependent segments beneath it in the hierarchy.
- Child segment – A segment that is a dependent of another segment above it in the hierarchy.
- Twin segment – A segment occurrence that exists with one or more segments of the same type under a single parent.
A record is defined as a root segment with all its dependent segments. A database record can contain a maximum of 255 types of segments.
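The segment relationships can be made concrete with a small Python sketch. It assumes the usual reading of hierarchic sequence as top to bottom, left to right. The `Segment` class and the helper functions are hypothetical names, and the `EDUCRS` segment type is invented for the example. `EMPLOC` and `EMPEDU` are borrowed from the sample DBD later in this article.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    seg_type: str                       # segment type name
    key: str                            # key of this occurrence
    children: list = field(default_factory=list)


def hierarchic_sequence(segment):
    """Yield a segment and its dependents top to bottom, left to right."""
    yield segment
    for child in segment.children:
        yield from hierarchic_sequence(child)


def twins(parent, seg_type):
    """Occurrences of one segment type under a single parent."""
    return [c for c in parent.children if c.seg_type == seg_type]


# One database record: a root segment with all its dependent segments.
record = Segment("EMPLOC", "NYC", [
    Segment("EMPEDU", "01", [Segment("EDUCRS", "A")]),   # EDUCRS is hypothetical
    Segment("EMPEDU", "02"),
])

for seg in hierarchic_sequence(record):
    print(seg.seg_type, seg.key)

print(len(twins(record, "EMPEDU")), "EMPEDU twins under the root")
```

Here `EMPLOC` is the root and a parent, both `EMPEDU` occurrences are its children and twins of each other, and every segment except `EMPLOC` is a dependent segment.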
Database Description (DBD)
- The DBD describes the physical structure of the database and also the access methods to be used. It is a series of macro statements that define the type of the DB, all segments and fields, logical relationships and indexing.
- DBD statements are submitted to the DBDGEN utility, which generates a DBD control block and stores it in the IMS.DBDLIB library for use when an application program accesses the database.
A sample DBD is given below:
         DBD     NAME=EMPDBD,ACCESS=HIDAM
ROOTSEGM DATASET DD1=EMPDAT,SIZE=4096,FRSPC=(00,05),DEVICE=3390
         SEGM    NAME=EMPLOC,PARENT=0,PTR=TB,COMPRTN=(IMSHRINK,DATA,INIT),BYTES=768
         LCHILD  NAME=(X1LOCKEY,CUSTX1),PTR=INDX
         FIELD   NAME=(LOCKEY,SEQ,U),START=1,TYPE=C,BYTES=11
         LCHILD  NAME=(X2LOCKEY,CUSTX2),PTR=INDX
         XDFLD   NAME=MNENX2,SEGMENT=EMPLOC,SUBSEQ=/SX2,SRCH=(LOCMNEN,LOCTOWN)
         FIELD   NAME=/SX2
         FIELD   NAME=LOCMNEN,START=39,TYPE=C,BYTES=7
         FIELD   NAME=LOCCORP,START=1,TYPE=C,BYTES=3
         FIELD   NAME=LOCTOWN,START=4,TYPE=C,BYTES=3
         SEGM    NAME=EMPEDU,PARENT=((EMPLOC,SNGL)),PTR=T,COMPRTN=(IMSHRINK,DATA,INIT),BYTES=640
         FIELD   NAME=(EDUSQNO,SEQ,U),START=1,TYPE=C,BYTES=2
         FIELD   NAME=EDUSCHOOL,START=3,TYPE=C,BYTES=6
         FIELD   NAME=EDUDEGREE,START=47,TYPE=C,BYTES=8
         FIELD   NAME=EDUYEAR,START=55,TYPE=C,BYTES=8
-----etc----
         DBDGEN
         FINISH
         END
The DBD contains the following statements:
- DBD – Names the database being described and specifies its organization.
- DATASET – Defines the DDname and block size of a data set. One DATASET statement is required for each data set group.
- SEGM – Defines a segment type, its position in the hierarchy, its physical characteristics, and its relationship to other segments. Up to 15 hierarchic levels can be defined. The maximum number of segment types for a single database is 255.
- FIELD – Defines a field within a segment. The maximum number of fields per segment is 255. The maximum number of fields per database is 1,000.
- LCHILD – Defines a secondary index or logical relationship between two segments. It is also used to define the relationship between a HIDAM index and the root segment of the database.
- XDFLD – Used only when a secondary index exists. It is associated with the target segment and specifies the name of the indexed field, the name of the source segment, and the field to be used to create the secondary index.
- DBDGEN – Indicates the end of statements defining the DBD.
- END – Indicates to the assembler that there are no more statements.
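The limits quoted for SEGM and FIELD are easy to remember once written down as a checklist. The Python sketch below is a study aid only, not part of DBDGEN: the function names and the dictionary layout are hypothetical, and the sample data simply mirrors the segments and fields of the sample DBD above.

```python
# Limits as stated in this article.
MAX_LEVELS = 15
MAX_SEGMENT_TYPES = 255
MAX_FIELDS_PER_SEGMENT = 255
MAX_FIELDS_PER_DATABASE = 1000


def level_of(name, segments):
    """Hierarchic level of a segment type; the root is level 1."""
    level = 1
    parent = segments[name]["parent"]
    while parent is not None:
        level += 1
        parent = segments[parent]["parent"]
    return level


def check_limits(segments):
    """Return a list of limit violations (empty if the definition fits)."""
    problems = []
    if len(segments) > MAX_SEGMENT_TYPES:
        problems.append("too many segment types")
    total_fields = 0
    for name, seg in segments.items():
        total_fields += len(seg["fields"])
        if len(seg["fields"]) > MAX_FIELDS_PER_SEGMENT:
            problems.append(f"{name}: too many fields")
        if level_of(name, segments) > MAX_LEVELS:
            problems.append(f"{name}: too many hierarchic levels")
    if total_fields > MAX_FIELDS_PER_DATABASE:
        problems.append("too many fields in the database")
    return problems


# Segment types and fields taken from the sample DBD (hypothetical layout).
sample = {
    "EMPLOC": {"parent": None,
               "fields": ["LOCKEY", "/SX2", "LOCMNEN", "LOCCORP", "LOCTOWN"]},
    "EMPEDU": {"parent": "EMPLOC",
               "fields": ["EDUSQNO", "EDUSCHOOL", "EDUDEGREE", "EDUYEAR"]},
}

print(check_limits(sample))
```

An empty list means the definition stays within all four limits. Each violation is reported as one message.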