Denormalisation pros and cons | Bài giảng môn thiết kế và quản trị cơ sở dữ liệu| Trường Đại học Bách Khoa Hà Nội

OBJECTIVES

• Definition of terms.

• Describe the denormalization design process.

• Denormalization Strategies

• A Comparative Case Study

• Know the pros and cons of denormalization

• The Dangerous Illusion

• Conclude

TM 1
Dr. Chen, Business Database Systems
Presented By Aliya Saldanha
DENORMALISATION
PROS AND CONS
TM 2
Dr. Chen, Business Database Systems
OBJECTIVES
Definition of terms.
Describe the denormalization design process.
Denormalization Strategies
A Comparative Case Study
Know the pros and cons of denormalization
The Dangerous Illusion
Conclude
TM 3
Dr. Chen, Business Database Systems
Introduction
RDBMS design - conceptual and physical
modeling levels.
Conceptual diagrams - precursor to designing
relational tables.
Critical issues- level of system performance,
reflected by system response time
TM 4
Dr. Chen, Business Database Systems
Normalization
The normalized model is
a cornerstone for every
database system.
Process of decomposing
large, inefficiently
structured tables into
smaller, more structured
tables without losing any
data in the process.
There are still times where we denormalize a database to enhance performance
TM 5
Dr. Chen, Business Database Systems
What is normalization?
A series of steps followed to obtain a database
that is consistent and avoids duplication
The process passes through fulfilling Normal Forms
A table is said to be in a certain normal form if
it satisfies certain constraints
KEY POINTS
Each table represents a single subject
Keeps redundancy to a minimum
All attributes are dependent on the primary key
Checks stability and integrity of E-R diagram
Removes Insert, Update, Delete anomalies.
1
st
Normal
Form
2
nd
Normal
Form
3
rd
Normal
Form
BCNF
4
th
Normal
Form
5
th
Normal
Form
Normalized relational db
model
Relational db model
TM 6
Dr. Chen, Business Database Systems
As normalization progresses…
The number of Relations required to represent
the data of the application being normalized
increases.
The increased number of tables require multiple
JOIN’s to combine data from different tables.
(more the joins the worse it gets)
Queries that have a lot of complex joins will
require more CPU usage and will adversely affect
performance.
TM 7
Dr. Chen, Business Database Systems
Practically speaking
Queries run slowly.
Reports take too long to print.
On-screen forms take time to populate.
Web pages take too long to populate.
More complicated SQL required for multi-table
queries and joins.
In short, extra work for DBMS can mean slower
applications
TM 8
Dr. Chen, Business Database Systems
Other issues…
No calculated values. CV’s are a fact of life for all applications,
but a normalized DBMS lacks them.
Non-reproducible Calculations. Application must generate them
on the fly as needed. If your application changes over time, you risk
not being able to reproduce prior results.
Join Jungles. When each fact is stored in exactly one place, you it
is daunting to pull together everything for a certain query. Making it
hard to code, hard to debug, and dangerous to alter.
Performance. When you face a JOIN jungle you almost always
face performance problems.
TM 9
Dr. Chen, Business Database Systems
??before denormalizing
Can the system achieve acceptable performance
without denormalizing?
Will the performance of the system after denormalizing
still be unacceptable?
Will the system be unreliable due to denormalization?
If the answer to any of these is "yes," avoid
denormalization because any benefit that is accrued
will not exceed the cost.
TM 10
Dr. Chen, Business Database Systems
Denormalization and Why?
Frequently, performance needs dictate very quick
retrieval capability for data stored in relational
databases.
To accomplish this, sometimes the decision is made
to denormalize the physical implementation.
Denormalization is the process of putting one fact in
numerous places. This speeds data retrieval at the
expense of data modification.
TM 11
Dr. Chen, Business Database Systems
Does it mean Un-normalization ?
‘Denormalization’ does not mean that
anything goes. Denormalization does not
mean chaos.
Un-normalized data model is little or no
analysis is performed.
In short, seek to denormalize a data model
that has already been normalized.
TM 12
Dr. Chen, Business Database Systems
DENORMALIZATION
PROCESS
Develop E-R
Refinement &Normalize
Identify candidates
Identifying form
Map to physical schema
Determining integrity effects
TM 13
Dr. Chen, Business Database Systems
Development of Conceptual data
model
E-R/M aims at identifying the entities that are part of
the system, the attributes that make up these entities,
and the dependencies between entities.
No Dependency among the attributes –
Normalization resolves the functional dependencies
between attributes
Shows Data at rest – Denormalization considers the
types of queries and their frequency
1
TM 14
Dr. Chen, Business Database Systems
Refinement and normalization
The ERD is further refined, in order to resolve the functional
dependencies between the attributes of an Entity.
May lead to splitting of tables to reduce data redundancy.
Identifying candidates for denormalization
Application performance criteria.
Type of queries to be executed (update/retrieve).
Frequency of queries
Number of rows accessed by each transaction.
Cardinality – 1:1, 1:M
Derived data, Lookup data
2
3
TM 15
Dr. Chen, Business Database Systems
Determine effect on data
integrity
The effect of denormalization is reviewed.
Denormalizing may lead to performance
degradation
Or unacceptable consistency issues.
In such a case Denormalization decision
must be reconsidered
4
TM 16
Dr. Chen, Business Database Systems
Form for denormalized entity
Identifying what form the denormalized
entity may take
We move down the normal forms ladder of
steps.
5
Map conceptual scheme to
physical scheme.
Once the scheme is tested and verified it is
implemented.
6
TM 17
Dr. Chen, Business Database Systems
DENORMALIZATION
STRATEGIES
Pre joined Tables
Report Tables
Mirror Tables
Split Tables
Redundant Data
Repeating Groups
Derivable Data
Speed Tables
TM 18
Dr. Chen, Business Database Systems
Pre-joined tables
Two or more tables are joined and the result is stored
as another table.
When the cost of joining is prohibitive
Example: Retail store databases
Contain only those columns absolutely necessary for
application to meet processing needs.
The pre-joined table must be created periodically using
SQL to join the normalized tables.
TM 19
Dr. Chen, Business Database Systems
1:1 Relationships
TM 20
Dr. Chen, Business Database Systems
M:M Relationship
| 1/65

Preview text:

DENORMALISATION PROS AND CONS
Presented By Aliya Saldanha
Dr. Chen, Business Database Systems TM 1 OBJECTIVES • Definition of terms.
• Describe the denormalization design process. • Denormalization Strategies • A Comparative Case Study
• Know the pros and cons of denormalization • The Dangerous Illusion • Conclude
Dr. Chen, Business Database Systems TM 2 Introduction
• RDBMS design - conceptual and physical modeling levels.
• Conceptual diagrams - precursor to designing relational tables.
• Critical issues- level of system performance,
reflected by system response time
Dr. Chen, Business Database Systems TM 3 Normalization • The normalized model is a cornerstone for every database system. • Process of decomposing large, inefficiently structured tables into smaller, more structured tables without losing any data in the process.
There are still times where we denormalize a database to enhance performance
Dr. Chen, Business Database Systems TM 4 What is normalization? Relational db model
• A series of steps fol owed to obtain a database
• that is consistent and avoids duplication 1st Normal Form
• The process passes through fulfil ing Normal Forms 2nd Normal Form
• A table is said to be in a certain normal form if 3rd Normal
it satisfies certain constraints Form BCNF • KEY POINTS 4th Normal
• Each table represents a single subject Form 5th Normal
• Keeps redundancy to a minimum Form
• All attributes are dependent on the primary key
• Checks stability and integrity of E-R diagram
• Removes Insert, Update, Delete anomalies. Normalized relational db
Dr. Chen, Business Database Systems TM 5 model
As normalization progresses…
• The number of Relations required to represent
the data of the application being normalized increases.
• The increased number of tables require multiple
JOIN’s to combine data from different tables.
(more the joins the worse it gets)
• Queries that have a lot of complex joins will
require more CPU usage and will adversely affect performance.
Dr. Chen, Business Database Systems TM 6 Practically speaking • Queries run slowly.
• Reports take too long to print.
• On-screen forms take time to populate.
• Web pages take too long to populate.
• More complicated SQL required for multi-table queries and joins.
• In short, extra work for DBMS can mean slower applications
Dr. Chen, Business Database Systems TM 7 Other issues…
No calculated values. CV’s are a fact of life for all applications,
but a normalized DBMS lacks them. •
Non-reproducible Calculations. Application must generate them
on the fly as needed. If your application changes over time, you risk
not being able to reproduce prior results. •
Join Jungles. When each fact is stored in exactly one place, you it
is daunting to pull together everything for a certain query. Making it
hard to code, hard to debug, and dangerous to alter. •
Performance. When you face a JOIN jungle you almost always face performance problems.
Dr. Chen, Business Database Systems TM 8 ??before denormalizing
• Can the system achieve acceptable performance without denormalizing?
• Will the performance of the system after denormalizing still be unacceptable?
• Will the system be unreliable due to denormalization?
• If the answer to any of these is "yes," avoid
denormalization because any benefit that is accrued will not exceed the cost.
Dr. Chen, Business Database Systems TM 9
Denormalization and Why?
• Frequently, performance needs dictate very quick
retrieval capability for data stored in relational databases.
• To accomplish this, sometimes the decision is made
to denormalize the physical implementation.
• Denormalization is the process of putting one fact in
numerous places. This speeds data retrieval at the expense of data modification.
Dr. Chen, Business Database Systems TM 10
Does it mean Un-normalization ?
• ‘Denormalization’ does not mean that
anything goes. Denormalization does not mean chaos.
• Un-normalized data model is little or no analysis is performed.
• In short, seek to denormalize a data model
that has already been normalized.
Dr. Chen, Business Database Systems TM 11 Develop E-R DENORMALIZATION PROCESS
Refinement &Normalize Identify candidates
Determining integrity effects Identifying form Map to physical schema
Dr. Chen, Business Database Systems TM 12
Development of Conceptual data 1 model
• E-R/M aims at identifying the entities that are part of
the system, the attributes that make up these entities,
and the dependencies between entities.
• No Dependency among the attributes –
Normalization resolves the functional dependencies between attributes
• Shows Data at rest – Denormalization considers the
types of queries and their frequency
Dr. Chen, Business Database Systems TM 13
2 Refinement and normalization
• The ERD is further refined, in order to resolve the functional
dependencies between the attributes of an Entity.
• May lead to splitting of tables to reduce data redundancy.
Identifying candidates for denormalization
• Application performance criteria.
• Type of queries to be executed (update/retrieve). • Frequency of queries 3
• Number of rows accessed by each transaction. • Cardinality – 1:1, 1:M • Derived data, Lookup data
Dr. Chen, Business Database Systems TM 14 4
Determine effect on data integrity
• The effect of denormalization is reviewed.
• Denormalizing may lead to performance degradation
• Or unacceptable consistency issues.
• In such a case Denormalization decision must be reconsidered
Dr. Chen, Business Database Systems TM 15
5 Form for denormalized entity
• Identifying what form the denormalized entity may take
• We move down the normal forms ladder of steps.
Map conceptual scheme to physical scheme.
6 • Once the scheme is tested and verified it is implemented.
Dr. Chen, Business Database Systems TM 16 DENORMALIZATION STRATEGIESPre joined TablesRedundant DataReport TablesRepeating GroupsMirror TablesDerivable DataSplit TablesSpeed Tables
Dr. Chen, Business Database Systems TM 17 Pre-joined tables
Two or more tables are joined and the result is stored as another table.
When the cost of joining is prohibitive
Example: Retail store databases
Contain only those columns absolutely necessary for
application to meet processing needs.
The pre-joined table must be created periodically using
SQL to join the normalized tables.
Dr. Chen, Business Database Systems TM 18 1:1 Relationships
Dr. Chen, Business Database Systems TM 19 M:M Relationship
Dr. Chen, Business Database Systems TM 20