Normalisation
Learning Outcomes:
- Explain the concepts of data redundancy and discuss the methods or measures used to reduce data redundancy.
- Students should be able to reduce data redundancy through normalization up to Third Normal Form (3NF).
Relevant past paper:
DSE ICT Elec A(SP-2016): SP 4c. PP 4c. 2012 3c, 4c. 2013 4c. 2014 2f. 2016 3b,dii.
AS CA(2000-2013): 2008 5b. 2009 2bc. 2011 3b. 2013 5a.
AL CS P2: 2004 1b.
To tackle the problem of data redundancy, we can perform normalisation.
First normal form (1NF)
Rules: 1. no multi-valued attribute/no repeating element
2. Has a primary key
StudentSubjects1
StuID(key)
|
FirstName
|
LastName
|
Subjects
|
001
|
David
|
Chan
|
ICT, Physics
|
The Subjects field is a multi-valued attribute as it allows multiple values in the attribute.
To convert it into 1NF
StudentSubjects2
StuID(key)
|
Subjects(key)
|
FirstName
|
LastName
|
001
|
ICT
|
David
|
Chan
|
001
|
Physics
|
David
|
Chan
|
(StuID+Subjects as a composite key)
Functional dependency
Before going into 2NF and 3NF, you need to learn the concept of functional dependency.
Club
ClubID(key)
|
ClubName
|
1001
|
Running Club
|
1002
|
Swimming Club
|
ClubID determines exactly one value of ClubName. Therefore, ClubName is said to be functionally dependent on ClubID.
i.e. ClubID 1001 will give you the ClubName Running Club, not Swimming Club.
Second normal form (2NF)
Rules: 1. in 1NF
2. No partial functional dependency between primary key fields and non-key fields
Partial functional dependency means some non-key fields depend on part of the key (instead of the primary key as a whole). (i.e.: Partial dependency can only happen when there is a composite key)
In the table StudentSubjects2, FirstName and LastName are functionally dependent on StuID, which is only a part of the composite key StuID+Subjects.
To convert it into 2NF
StudentSubjects
StuID(key)
|
Subjects(key)
|
001
|
ICT
|
001
|
Physics
|
Students
StuID(key)
|
FirstName
|
LastName
|
001
|
David
|
Chan
|
Third normal form (3NF)
Rules: 1. in 2NF
2. No transitive functional dependency between primary key field(s) and non-key fields
Transitive functional dependency happens when non-key field depends on another non-key field (instead of the primary key field).
ClubLeader
ClubID(key)
|
StuID
|
FirstName
|
LastName
|
1001
|
001
|
David
|
Chan
|
StuID is functionally dependent on ClubID.
Yet, the FirstName and LastName are functionally dependent on StuID, but not ClubID.
To convert it into 3NF
ClubLeader
ClubID(key)
|
StuID
|
1001
|
001
|
Students
StuID(key)
|
FirstName
|
LastName
|
001
|
David
|
Chan
|
Advantage of normalisation: reduce data redundancy
Disadvantage of normalisation: the table(s) are decomposed into smaller tables. Multiple join operation may be needed to retrieve data, which can reduce database performance.
Question:
Is the table in 2NF? (Give the reason(rule), and state an example using the table provided)
Yes – because it is in 1NF and there is no partial dependency. All attributes fully depend on whole key______(write down the primary key).
No – because there is partial dependency. ____(an attribute) depends on part of the key ___(part of a composite key).
No – the table is not in 1NF. (If you have already worked this out in other question)