Share the thoughts ---: Normalization

SQL: Normalization

What is Normalization?

Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored.

The Normal Forms

The database community has developed a series of guidelines for ensuring that databases are normalized. These are referred to as normal forms and are numbered from one (the lowest form of normalization, referred to as first normal form or 1NF) through five (fifth normal form or 5NF). In practical applications, you'll often see 1NF, 2NF, and 3NF along with the occasional 4NF. Fifth normal form is very rarely seen and won't be discussed in this article.

Before we begin our discussion of the normal forms, it's important to point out that they are guidelines and guidelines only. Occasionally, it becomes necessary to stray from them to meet practical business requirements. However, when variations take place, it's extremely important to evaluate any possible ramifications they could have on your system and account for possible inconsistencies. That said, let's explore the normal forms.

First Normal Form (1NF)

First normal form (1NF) sets the very basic rules for an organized database:

* Eliminate duplicative columns from the same table.

* Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

Second Normal Form (2NF)

Second normal form (2NF) further addresses the concept of removing duplicative data:

* Meet all the requirements of the first normal form.

* Remove subsets of data that apply to multiple rows of a table and place them in separate tables.

* Create relationships between these new tables and their predecessors through the use of foreign keys.

Third Normal Form (3NF)

Third normal form (3NF) goes one large step further:

* Meet all the requirements of the second normal form.

* Remove columns that are not directly dependent upon the primary key, such as a column whose value is determined by another non-key column.

Fourth Normal Form (4NF)

Finally, fourth normal form (4NF) has one additional requirement:

* Meet all the requirements of the third normal form.

* Remove multi-valued dependencies: a relation is in 4NF if every non-trivial multi-valued dependency it contains is a dependency on a superkey.

Remember, these normalization guidelines are cumulative. For a database to be in 2NF, it must first fulfill all the criteria of a 1NF database.
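The guidelines above can be sketched on a small case. This is only an illustration under stated assumptions, not the article's own example: it uses Python's built-in sqlite3 module with an in-memory database, and the tables (Customer, CustomerOrder), their columns, and the sample rows are all hypothetical. A flat order list that repeats the customer's city on every row is split so that the city is stored once per customer, and a join rebuilds the flat rows on demand.

```python
import sqlite3

# Flat, un-normalized rows: the customer's city is repeated on every order.
flat_orders = [
    (1, "Asha", "Pune", "Keyboard"),
    (2, "Asha", "Pune", "Mouse"),
    (3, "Ravi", "Mumbai", "Monitor"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        Name TEXT NOT NULL UNIQUE,   -- assumption: names identify customers
        City TEXT NOT NULL
    );
    CREATE TABLE CustomerOrder (
        OrderID INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL REFERENCES Customer (CustomerID),
        Item TEXT NOT NULL
    );
""")

for order_id, name, city, item in flat_orders:
    # Store each customer once; later orders reuse the existing row.
    conn.execute(
        "INSERT OR IGNORE INTO Customer (Name, City) VALUES (?, ?)",
        (name, city),
    )
    (customer_id,) = conn.execute(
        "SELECT CustomerID FROM Customer WHERE Name = ?", (name,)
    ).fetchone()
    conn.execute(
        "INSERT INTO CustomerOrder (OrderID, CustomerID, Item) VALUES (?, ?, ?)",
        (order_id, customer_id, item),
    )

customer_count = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]

# A join rebuilds the original flat rows when they are needed.
rebuilt = conn.execute("""
    SELECT o.OrderID, c.Name, c.City, o.Item
    FROM CustomerOrder AS o
    JOIN Customer AS c ON c.CustomerID = o.CustomerID
    ORDER BY o.OrderID
""").fetchall()
```

Three orders end up referring to two customer rows, so a change of city is made in one place instead of on every order.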


SQL - Indexes

Indexes in databases are very similar to indexes in libraries. Indexes allow you to locate information within a database quickly, much like they do in libraries. If all books in a library are indexed alphabetically, then you don't need to browse the whole library to find a particular book. Instead, you take the first letter of the book title, find that letter's section in the library, and start your search from there, which narrows down your search significantly.

An index can be created on a single column or a combination of columns in a database table. A table index is a database structure that arranges the values of one or more columns in a database table in a specific order. The table index has pointers to the values stored in the specified column or combination of columns of the table. These pointers are ordered according to the sort order specified in the index.

Here is how to use the CREATE INDEX SQL statement to create an index called idxModel on the column Model in the Product table:

CREATE INDEX idxModel
ON Product (Model)

The syntax for creating indexes varies greatly among different RDBMS products, which is why we will not discuss this matter further.

There are some general rules that describe when to use indexes. On relatively small tables, indexes rarely improve performance. In general, indexes improve performance when they are created on fields used in table joins. Use indexes when most of your database queries retrieve relatively small datasets; if your queries retrieve most of the data most of the time, the indexes will actually slow data retrieval. Use indexes for columns that have many different values (few repeated values within the column). Although indexes improve search performance, they slow down inserts, updates, and deletes, because every index has to be maintained along with the table, and this is worth considering.
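The effect of an index on a lookup can be observed with a query plan. The sketch below is an illustration only: it assumes SQLite through Python's sqlite3 module rather than the RDBMS the article has in mind, the Product columns other than Model are hypothetical, and query_plan is a helper invented for this example. The exact wording of SQLite's plan text varies between versions, so none is quoted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal Product table; only Model matters for this sketch.
conn.execute(
    "CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, Model TEXT, Price REAL)"
)
conn.executemany(
    "INSERT INTO Product (Model, Price) VALUES (?, ?)",
    [(f"Model-{i}", 100.0 + i) for i in range(1000)],
)

query = "SELECT * FROM Product WHERE Model = 'Model-500'"


def query_plan(connection, sql):
    """Return SQLite's plan description for a statement as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row is the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn, query)  # no index yet: the table is scanned
conn.execute("CREATE INDEX idxModel ON Product (Model)")
plan_after = query_plan(conn, query)   # the plan now refers to idxModel

print(plan_before)
print(plan_after)
```

Before the index exists the plan reports a scan of Product; afterwards it reports a search that uses idxModel.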

SQL - Views

A SQL view is a virtual table based on a SQL SELECT query. Essentially, a view is very close to a real database table (it has columns and rows just like a regular table), except that real tables store data, while views don't. The view's data is generated dynamically when the view is referenced. A view references one or more existing database tables or other views. In effect, every view is a filter on the table data referenced in it, and this filter can restrict both the columns and the rows of the referenced tables.

Here is an example of how to create a SQL view using the already familiar Product and Manufacturer SQL tables:

CREATE VIEW vwAveragePrice AS
SELECT Manufacturer, ManufacturerWebsite, ManufacturerEmail, AVG(Price) AS AvgPrice
FROM Manufacturer JOIN Product
ON Manufacturer.ManufacturerID = Product.ManufacturerID
GROUP BY Manufacturer, ManufacturerWebsite, ManufacturerEmail

A view can be referenced and used from another view, from a SQL query, and from a stored procedure. You reference a view as you would reference any real SQL database table:

SELECT * FROM vwAveragePrice
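That a view stores no data of its own can be checked by changing the underlying table and querying the view again. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module; the table schemas and sample rows are hypothetical, since the article does not list the real columns of Product and Manufacturer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schemas and sample data for illustration only.
    CREATE TABLE Manufacturer (
        ManufacturerID INTEGER PRIMARY KEY,
        Manufacturer TEXT,
        ManufacturerWebsite TEXT,
        ManufacturerEmail TEXT
    );
    CREATE TABLE Product (
        ProductID INTEGER PRIMARY KEY,
        ManufacturerID INTEGER REFERENCES Manufacturer (ManufacturerID),
        Model TEXT,
        Price REAL
    );
    INSERT INTO Manufacturer VALUES (1, 'Acme', 'acme.example', 'info@acme.example');
    INSERT INTO Product VALUES (1, 1, 'A1', 100.0);
    INSERT INTO Product VALUES (2, 1, 'A2', 200.0);

    CREATE VIEW vwAveragePrice AS
    SELECT Manufacturer, ManufacturerWebsite, ManufacturerEmail, AVG(Price) AS AvgPrice
    FROM Manufacturer JOIN Product
      ON Manufacturer.ManufacturerID = Product.ManufacturerID
    GROUP BY Manufacturer, ManufacturerWebsite, ManufacturerEmail;
""")

avg_before = conn.execute("SELECT AvgPrice FROM vwAveragePrice").fetchone()[0]

# The view holds no rows itself: a new product changes its result at once.
conn.execute("INSERT INTO Product VALUES (3, 1, 'A3', 600.0)")
avg_after = conn.execute("SELECT AvgPrice FROM vwAveragePrice").fetchone()[0]
```

The average moves from 150.0 to 300.0 without the view being touched, because the SELECT behind it is re-evaluated on each reference.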

SQL - Triggers

Triggers are special types of Stored Procedures that are defined to execute automatically in place of or after data modifications. They can be executed automatically on the INSERT, DELETE and UPDATE triggering actions.

There are two different types of triggers in Microsoft SQL Server 2000. They are INSTEAD OF triggers and AFTER triggers. These triggers differ from each other in terms of their purpose and when they are fired. In this article we shall discuss each type of trigger.

First of all, let's create a sample database with some tables and insert some sample data in those tables using the script below:

CREATE DATABASE KDMNN
GO

USE KDMNN
GO

CREATE TABLE [dbo].[User_Details] (
    [UserID] [int] NULL ,
    [FName] [varchar] (50) NOT NULL ,
    [MName] [varchar] (50) NULL ,
    [LName] [varchar] (50) NOT NULL ,
    [Email] [varchar] (50) NOT NULL
) ON [PRIMARY]

CREATE TABLE [dbo].[User_Master] (
    [UserID] [int] IDENTITY (1, 1) NOT NULL ,
    [UserName] [varchar] (50) NULL ,
    [Password] [varchar] (50) NULL
) ON [PRIMARY]

ALTER TABLE [dbo].[User_Master] WITH NOCHECK ADD
    CONSTRAINT [PK_User_Master] PRIMARY KEY CLUSTERED
    (
        [UserID]
    ) ON [PRIMARY]

ALTER TABLE [dbo].[User_Details] ADD
    CONSTRAINT [FK_User_Details_User_Master] FOREIGN KEY
    (
        [UserID]
    ) REFERENCES [dbo].[User_Master] (
        [UserID]
    )

-- UNION ALL is used because the rows are distinct and need no de-duplication.
INSERT INTO USER_MASTER(USERNAME, PASSWORD)
SELECT 'Navneeth','Navneeth' UNION ALL
SELECT 'Amol','Amol' UNION ALL
SELECT 'Anil','Anil' UNION ALL
SELECT 'Murthy','Murthy'

INSERT INTO USER_DETAILS(USERID, FNAME, LNAME, EMAIL)
SELECT 1,'Navneeth','Naik','navneeth@kdmnn.com' UNION ALL
SELECT 2,'Amol','Kulkarni','amol@kdmnn.com' UNION ALL
SELECT 3,'Anil','Bahirat','anil@kdmnn.com' UNION ALL
SELECT 4,'Murthy','Belluri','murthy@kdmnn.com'

AFTER Triggers

An AFTER trigger is a trigger that is executed automatically after the statement that triggered it completes, but before the enclosing transaction is committed or rolled back.

Using the script below, we first create a trigger on the table USER_MASTER for the INSERT event, and then insert a row inside a transaction to see the trigger fire.

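USE KDMNN
GO

-- CREATE TRIGGER must be the first statement in its batch.
CREATE TRIGGER trgInsert
ON User_Master
FOR INSERT
AS
PRINT ('AFTER Trigger [trgInsert] – Trigger executed !!')
GO

BEGIN TRANSACTION
DECLARE @ERR INT

INSERT INTO USER_MASTER(USERNAME, PASSWORD)
VALUES('Damerla','Damerla')
SET @ERR = @@Error

-- Note: a successful insert (@ERR = 0) is rolled back here. The trigger
-- message still appears, because the trigger fires before the transaction ends.
IF @ERR = 0
BEGIN
    ROLLBACK TRANSACTION
    PRINT 'ROLLBACK TRANSACTION'
END
ELSE
BEGIN
    COMMIT TRANSACTION
    PRINT 'COMMIT TRANSACTION'
END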

Output

AFTER Trigger [trgInsert] – Trigger executed !!

(1 row(s) affected)

ROLLBACK TRANSACTION
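The same ordering (trigger first, commit or rollback afterwards) can be reproduced outside SQL Server. The sketch below is an illustration only, assuming SQLite through Python's sqlite3 module. SQLite has no PRINT, so a hypothetical Trigger_Log table stands in for the message, and count is a helper invented for this example.

```python
import sqlite3

# isolation_level=None lets the script issue BEGIN and ROLLBACK itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE User_Master (
        UserID INTEGER PRIMARY KEY AUTOINCREMENT,
        UserName TEXT,
        Password TEXT
    );
    -- Hypothetical log table standing in for T-SQL's PRINT.
    CREATE TABLE Trigger_Log (Message TEXT);

    CREATE TRIGGER trgInsert AFTER INSERT ON User_Master
    BEGIN
        INSERT INTO Trigger_Log VALUES ('trgInsert fired for ' || NEW.UserName);
    END;
""")


def count(table):
    """Return the number of rows currently visible in the given table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn.execute("BEGIN")
conn.execute(
    "INSERT INTO User_Master (UserName, Password) VALUES ('Damerla', 'Damerla')"
)
# The trigger has already run, although nothing is committed yet.
log_inside_txn = count("Trigger_Log")
conn.execute("ROLLBACK")

# Rolling back undoes the insert and the trigger's own work together.
users_after = count("User_Master")
log_after = count("Trigger_Log")
```

Inside the transaction the log already holds one row; after the rollback both tables are empty again, since the trigger's work belongs to the same transaction as the statement that fired it.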

SQL - Stored Procedures

A Definition and an Example

A stored procedure is a procedure (like a subprogram in a regular computing language) that is stored (in the database). Strictly speaking, MySQL supports "routines", and there are two kinds of routines: stored procedures, which you call, and functions, whose return values you use in other SQL statements the same way that you use pre-installed MySQL functions like pi(). I'll use the term "stored procedures" more frequently than "routines" because it's what we've used in the past, and what people expect us to use.

A stored procedure has a name, a parameter list, and an SQL statement, which can contain many more SQL statements. There is new syntax for local variables, error handling, loop control, and IF conditions. Here is an example of a statement that creates a stored procedure.

CREATE PROCEDURE procedure1                /* name */
(IN parameter1 INTEGER)                    /* parameters */
BEGIN                                      /* start of block */
  DECLARE variable1 CHAR(10);              /* variables */
  IF parameter1 = 17 THEN                  /* start of IF */
    SET variable1 = 'birds';               /* assignment */
  ELSE
    SET variable1 = 'beasts';              /* assignment */
  END IF;                                  /* end of IF */
  INSERT INTO table1 VALUES (variable1);   /* statement */
END                                        /* end of block */

What I'm going to do is explain in detail all the things you can do with stored procedures. We'll also get into another new database object, triggers, because there is a tendency to associate triggers with stored procedures.

Primary Key Constraints

Primary keys are the unique identifiers for each row. They must contain unique values and cannot be null. Due to their importance in relational databases, Primary keys are the most fundamental of all keys and constraints. A table can have only one Primary key. A Primary key ensures uniqueness within the column declared as being part of that Primary key, and that unique value serves as an identifier for each row in that table. There are two ways to create Primary keys: the CREATE TABLE and ALTER TABLE commands. Using a small, integer column as a Primary key is recommended. Each table should have a Primary key.
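The uniqueness rule can be seen in a few lines. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; the Employee table and its rows are hypothetical, and only the duplicate-value case is covered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; a small integer column serves as the primary key.
conn.execute(
    "CREATE TABLE Employee (EmployeeID INTEGER PRIMARY KEY, Name TEXT NOT NULL)"
)
conn.execute("INSERT INTO Employee (EmployeeID, Name) VALUES (1, 'First')")

try:
    # Reusing key value 1 violates the primary key.
    conn.execute("INSERT INTO Employee (EmployeeID, Name) VALUES (1, 'Second')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
```

The second insert raises an integrity error and the table keeps its single row.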

Foreign Key Constraints

Foreign keys are both a method of ensuring data integrity and a manifestation of the relationship between tables. When we add a Foreign key to a table, we create a dependency between the table for which we define the Foreign key (the referencing table) and the table the Foreign key references (the referenced table). Once we have set up a Foreign key for a table, any record inserted into the referencing table must either have a matching record in the referenced column(s) of the referenced table, or the value of the Foreign key column must be NULL. Some examples will help to clarify this.

The syntax for the Foreign key is as follows:

FOREIGN KEY REFERENCES <table name> (<column name>)
[ON DELETE {CASCADE | NO ACTION}]
[ON UPDATE {CASCADE | NO ACTION}]
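The matching-or-NULL rule and the ON DELETE CASCADE option can be sketched as follows. This is an illustration under stated assumptions: it uses SQLite through Python's sqlite3 module (where foreign key enforcement has to be switched on per connection), cut-down versions of the User_Master and User_Details tables, example.com addresses that are made up, and a helper, try_insert, invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE User_Master (UserID INTEGER PRIMARY KEY, UserName TEXT);
    CREATE TABLE User_Details (
        UserID INTEGER
            REFERENCES User_Master (UserID) ON DELETE CASCADE,
        Email TEXT NOT NULL
    );
    INSERT INTO User_Master VALUES (1, 'Navneeth');
""")


def try_insert(user_id, email):
    """Return True if the row is accepted, False if the foreign key rejects it."""
    try:
        conn.execute("INSERT INTO User_Details VALUES (?, ?)", (user_id, email))
        return True
    except sqlite3.IntegrityError:
        return False


matching_ok = try_insert(1, "navneeth@kdmnn.com")  # parent row 1 exists
null_ok = try_insert(None, "nobody@example.com")   # NULL is not checked
orphan_ok = try_insert(99, "ghost@example.com")    # no parent row 99

# ON DELETE CASCADE removes the dependent detail row with its parent.
conn.execute("DELETE FROM User_Master WHERE UserID = 1")
remaining = conn.execute("SELECT UserID, Email FROM User_Details").fetchall()
```

The matching row and the NULL row are accepted, the orphan is rejected, and deleting the parent leaves only the NULL row behind.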

Unique Constraints

The easiest to handle, Unique constraints are essentially the younger siblings of Primary keys. They require a unique value throughout the named column or combination of columns in the table. Unique constraints are often referred to as Alternate Keys. Alternate keys are not considered to be the unique identifier of a record in a table. A table can have multiple Unique constraints. Once we establish a Unique constraint, every value in the named column must be unique. SQL Server will raise an error if you try to update or insert a row with a value that already exists in a column with a Unique constraint.

Unlike a Primary key, a Unique constraint does not automatically prevent us from entering NULL values; we have to declare the nullability of the column explicitly. Keep in mind, though, that if we do allow NULL values, SQL Server will still let us insert only one of them. Although a NULL does not equal another NULL, the Unique constraint treats a second NULL as a duplicate value.
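A Unique constraint rejecting a repeated value can be sketched as below, with one caveat: the sketch assumes SQLite through Python's sqlite3 module, and SQLite differs from the SQL Server behaviour described above in that it accepts several NULLs in a unique column. The Account table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: UserID is the primary key, Email an alternate key.
conn.execute("""
    CREATE TABLE Account (
        UserID INTEGER PRIMARY KEY,
        Email TEXT UNIQUE
    )
""")
conn.execute("INSERT INTO Account (Email) VALUES ('amol@kdmnn.com')")

try:
    # The same address a second time violates the Unique constraint.
    conn.execute("INSERT INTO Account (Email) VALUES ('amol@kdmnn.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# SQLite-specific behaviour: several NULLs are accepted in a UNIQUE column.
conn.execute("INSERT INTO Account (Email) VALUES (NULL)")
conn.execute("INSERT INTO Account (Email) VALUES (NULL)")
null_rows = conn.execute(
    "SELECT COUNT(*) FROM Account WHERE Email IS NULL"
).fetchone()[0]
```

How NULLs interact with a Unique constraint is product-specific, so it is worth checking on the RDBMS actually in use.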

Saturday, September 20, 2008
