What is Data ?
‘A representation of facts or instruction in a form suitable for communication’ – IBM Dictionary
‘Is a repository for stored data’ – C.J.Date
What is a database system?
An integrated and shared repository for stored data or collection of stored operational data used by application systems of some particular enterprise or ‘Nothing more than a computer-based record keeping system’.
Advantages of DBMS over File Management Systems
- Data redundancy
- Multiple views
- Shared data
- Data independence (logical/physical)
- Data dictionary
- Search versatility
- Cost effective
- Security & Control
- Recovery restart & Backup
TYPES OF DATABASES
- Hierarchical Model
- Network Model
- Relational Model
- Object-Oriented Model
- Top down structure resembling an upside-down tree
- Parent child relationship
- First logical database model
- Available on most of the Mainframe computers. Example – IMS
- Does not distinguish between parent and child. Any record type can be associated with any number of arbitrary record types
- Enhanced to overcome limitations of other models but in reality, there is minimal difference due to frequent enhancements. Example – IDMS
- Data stored in the form of tables consists of multiple rows and columns.
Examples – DB2, Oracle, Sybase, Ingres etc.
OBJECT -ORIENTED MODEL
Data attributes and methods that operate on those attributes are encapsulated in structures called objects
RELATIONAL DB CONCEPTS
Why Relational ? – Relation is a mathematical term for a table – Hence Relational database ‘is perceived’ by the users as a set of tables.
– All data values are atomic.
– Entries in columns are from the same domain
– Sequence of rows (T-B) is insignificant
– Each row is unique
– Sequence of columns (L-R) is insignificant
Relational Concepts (Terminology)
- Relation : A table or File
- Tuple : Row contains an entry for each attribute
- Attributes : Columns or the characteristics that define the entity
- Domain:. A range of values (or Pool)
- Entity : Some object about which we wish to store information
- Null : Represents an unknown/empty value
- Atomic Value: Smallest unit of data; the individual data value
- Candidate key : Some attribute (or a set of attributes) that may uniquely identify each row(tuple) in the relation(table)
- Primary key : The candidate key that is chosen for primary attributes to uniquely identify each row.
- Alternate key :The remaining candidate keys that were not chosen as primary key
- Foreign key :An attribute of one relation that might be a primary key of another relation.
Normalization (1NF – 5NF)
It is done to bring the design of database to a standardized mode or (form)
- 1NF : All entities must have a unique identifier, or key, that can be composed of one or more attributes. All attributes must be atomic and non repeating.
- 2NF : Partial functional dependencies removed – all attributes that are not a part of the key must depend on the entire key for that entity.
- 3NF : Transitive dependencies removed – attributes that are not a part of the key must not depend on any non-key attribute.
- 4NF : Multi valued dependencies removed
- 5NF : Remaining anomalies removed
Types of Integrity
- Entity Integrity : Is a state Where no column that is part of a primary key can have a null values.
- Referential Integrity : Is a state Where every foreign key in the first table must either match a primary key value in the second table or must be wholly null
CODD’s RELATIONAL RULES
- All information in a relational database is represented explicitly at the logical level and in exactly one way – by values in tables
- Each and every datum(atomic value) in a relational database is guaranteed to be logically accessible by resorting to a combination of tablename, primary key value, and column name.
- Null values are supported for representing missing information in a systematic way irrespective of the datatype.
- The database description is represented at the logical level in the same way as ordinary data, so that authorized users can apply the same relational language to its interrogation as they apply to the regular data.
- A relational system may support several languages and various modes of terminal use. However there must be one language whose statements can express all of the following items: (1)data definitions (2)view definitions (3)data manipulation(interactive and by program)(4) integrity constraints (5) authorization(6) transaction boundaries(begin, commit, rollback)
- All views are theoretically updatable, are also updatable by the system
- The capability of handling a base relation or a derived relation (view) as a single operand applies not only to the retrieval of of data but also to the insertion, updation and deletion of data
- Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representations or access methods
- Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind that theoretically permit unimpairment are made to the base tables.
- Integrity constraints specific to a particular relational database must be definable in the relational data sublanguage and storable in the catalog, not in the application programs.
- The data manipulation sublanguage of a relational DBMS must enable application programs and inquiries to remain logically the same whether and whenever data are physically centralized or distributed.
- If a relational system has a low-level(single-record-at-a-time)language, that low level cannot be used to subvert or bypass the integrity rules and constraints expressed in the higher-level relational language(multiple-records-at-a-time)
Example of a Relational Structure
CUSTOMER Places ORDERS
ORDERS Has PRODUCTS
The above relations can be interpreted as follows :
- Each order relates to only one customer (one-to-one)
- Many orders can contain many products (many-to-many)
- A CUSTOMER can place any number of orders (one-to-many)
Entity Relationship Model:
- In the above example CUSTOMER, Order & Product are called ENTITIES.
- An Entity may transform into table(s).
- The unique identity for information stored in an ENTITY is called a PRIMARY KEY. E.g… CUSTOMER-No uniquely identifies each customer
A table essentially consists of
– Attributes, which define the characteristics of the table
– Primary key, which uniquely identifies each row of data stored in a table
– Secondary & Foreign Keys/indexes
Table Definition :
Table ‘CUSTOMER’ –
Attributes – CUST_NO, CUST_NAME,
CUST-LOCATION, CUST_ID, ORDER_NO…
Primary Key – CUSTOMER_NO
Secondary Key – CUST_ID
Foreign-Key – ORDER_NO
- The Relationships transform into Foreign Keys. For e.g.. CUSTOMER is related to Orders through ‘ORDER_NO’ which is the Foreign-key in CUSTOMER and Primary key in Order. So basically the relationship ‘Places’ is through the ORDER_NO.
- As per the relational integrity the Primary-Key, ORDER_NO, for the table ‘Orders’ can never be Null, while it can be so in the table ‘CUSTOMER’.
- Tables exist in Tablespaces. A tablespace can contain one or more tables
- Apart from the Primary Key, a table can have many secondary keys/indexes, which exist in Indexspaces.
- These tablespaces and indexspaces together exist in a Database
- To do transformations as described above we need a tool that will provide a way of creating the tables, manipulate the data present in these, create relationships, indexes, tablespace, indexspace and so on. DB2 provides SQL which performs these functions.
Related: DB2 Tutorial Part-2 here