lOMoARcPSD|45315597
lOMoARcPSD| 45315597
lOMoARcPSD|45315597
lOMoARcPSD|45315597
Stored Data
Data: the contents of database itself
Metadata:
the database schema that describes the structure of and constraints
on the database Statistics
information gathered and stored by the DBMS about data properties such as
the size of, and values in, various relations or other components of the
database
Indexes
data structures that support efficient access to the data.
4
lOMoARcPSD|45315597
Transaction Management
Accept transaction commands from an application which tell
the transaction manager when transactions begin and end, as
well as information about the expectations of the application
(some may not wish to require atomicity, for example).
The transaction processor performs the following tasks:
Logging
Recovery
Concurrency control
7
DBMS
lOMoARcPSD|45315597
Transaction Management
Logging
every change in the database is logged separately on disk.
The log manager follows one of several policies designed to assure that no
matter when a system failure or crash occurs.
Recovery manager
able to examine the log changes and restore the database to some consistent
state.
Concurency control
transactions must appear to execute in isolation.
the schedeler (concurrency control manager) must assure that the
individual actions of multiple transactions are executed in such an
order that the net effect in the same as if the transactions had in fact
executed in their entirely, once-at-atime.
A typical scheduler does its work by maintaining locks on certain pieces
of the database.
8
lOMoARcPSD|45315597
Internal schema
describes the physical storage structure of the database.
Conceptual schema
describes the structure of the whole database for a community of users.
External schemas or user views.
describes the part of the database that a particular user group is
interested in and hides the rest of the database from that user group.
10
DBMS Utilities
Loading
To load existing data files into the database
Backup & recovery
To create a backup copy of the database
To recovery an old state of DB from a backup file
File reorganization
To reorganize a database file in order to improve performance
Report generation
To generate reports based on the information from the database
Performance monitoring
To provide the DBA with statistical data about the DB usage.
11
lOMoARcPSD|45315597
12
lOMoARcPSD|45315597
What is Database Tuning?
Activity of making a database application run faster:
Faster means higher throughput (or responsetime)
Avoiding transactions that create bottlenecks
oravoiding queries that run for hours unnecessarily is a
must.
A 5% improvement is significant.
2
lOMoARcPSD|45315597
Why Database Tuning?
Troubleshooting:
Make managers and users happy given an
application and a DBMS Capacity Sizing:
Buy the right DBMS given application requirements
Application Programming:
Coding your application for performance
3
lOMoARcPSD|45315597
Outline
Schema tuning
Index tuning
Query Processing
Query tuning
Transaction Management
Transaction tuning
Tuning Distributed Application
6
lOMoARcPSD|45315597
Tuning Principles
1. Think globally, fix locally
2. Partitioning breaks bottlenecks
temporal and spatial
3. Start-up costs are high; running costs are low
4. Render unto server what is due unto server
5. Be prepared for trade-offs
7
Think globally, fix locally
Proper identification of problem; minimal intervention
Understand the whole, including the application goals
before taking a set of queries and find the indexes that
speed them up.
Example:
High I/O, paging and processor utilization may be due to frequent
query scans instead of using an index or log sharing a disk with
some frequently accessed data.
8
lOMoARcPSD|45315597
Partitioning breaks bottlenecks
Technique for reducing the load on a certain component
of the system either by dividing the load over more
resources or by spreading the load over time
Partitioning may not always solve bottleneck:
First, try to speed up the component
If it doesn’t work, partition
Example:
Lock and resource contention among long and short
transactions
9
Start-up costs are high; running costs are low
Obtain the effect you want with the fewest possible start-
ups
Examples:
It is expensive to begin a read operation on a disk, but once it
starts disk can deliver data at high speed.
So, frequently scanned tables should be laid out consecutively on disk.
Cost of parsing, semantic analysis, and selecting access
paths for simple queries is significant So, often executed
queries should be compiled
lOMoARcPSD|45315597
10
Render unto server what is due unto
server
Important design question is the allocation of work
between the DB system (server) and the application
program (client)
Depends on:
Relative computing resources of client and server
Where the relevant information is located
Whether the DB task interacts with the screen
11
lOMoARcPSD|45315597
Be prepared for trade-offs
Increasing speed of application requires combination of
memory, disk and computational resources
Examples:
Adding an index => speeds up critical query, but increases
disk storage, and space in RAM
Increasing RAM => Decreasing I/O, speed up query, but
spending more money
12
lOMoARcPSD|45315597
lOMoARcPSD|45315597
Database Schema
A relation schema ~ name, {attributes}
R(a int, b varchar[20]);
A relation instance for R ~ a set of records over the
attributes in the schema for R
3
Some schema are better than others ?
Schema1:
OnOrder1(supplier_id, part_id, quantity, supplier_address)
Schema 2:
OnOrder2(supplier_id, part_id, quantity);
Supplier(supplier_id, supplier_address);
4
lOMoARcPSD|45315597
Functional Dependencies
X ~ a set of attributes of relation R, and A is a single
attribute of R.
X -> A holds for R iff:
For any relation instance I of R, whenever there are two
records r and r’ in I with the same X values, they have the same
A value as well.
OnOrder1(supplier_id, part_id, quantity, supplier_address)
supplier_id -> supplier_address
5
Key of a Relation
Subset of attributes X from R constitutant a key of R if
X determines every attribute in R and no proper subset
of X determines an attribute in R.
OnOrder1(supplier_id, part_id, quantity, supplier_address)
supplier_id, part_id is not a key
Supplier(supplier_id, supplier_address);
Supplier_id is a key
6
lOMoARcPSD|45315597
Normalization
A relation is normalized if every interesting
functional dependency X -> A involving attributes in
R has the property that X is a key of R
OnOrder1 is not normalized
OnOrder2 and Supplier are normalized
Here normalized, refers to BCNF (Boyce-Codd Normal Form)
7
Example #1
Suppose that a bank associates each customer with
his or her home branch. Each branch is in a specific
legal jurisdiction.
R(customer, branch, jurisdiction) normalized?
8

Preview text:

lOMoARcPSD| 45315597 lOMoAR cPSD| 45315597 lOMoARcPSD| 45315597 lOMoARcPSD| 45315597 Stored Data
Data: the contents of database itself Metadata:
the database schema that describes the structure of and constraints on the database Statistics
information gathered and stored by the DBMS about data properties such as
the size of, and values in, various relations or other components of the database Indexes
data structures that support efficient access to the data. 4 lOMoARcPSD| 45315597 DBMS Transaction Management
Accept transaction commands from an application which tell
the transaction manager when transactions begin and end, as
well as information about the expectations of the application
(some may not wish to require atomicity, for example).
The transaction processor performs the following tasks: Logging Recovery Concurrency control 7 lOMoARcPSD| 45315597 Transaction Management Logging
every change in the database is logged separately on disk.
The log manager follows one of several policies designed to assure that no
matter when a system failure or crash occurs. Recovery manager
able to examine the log changes and restore the database to some consistent state. Concurency control
transactions must appear to execute in isolation.
the schedeler (concurrency control manager) must assure that the
individual actions of multiple transactions are executed in such an
order that the net effect in the same as if the transactions had in fact
executed in their entirely, once-at-atime.
A typical scheduler does its work by maintaining locks on certain pieces of the database. 8 lOMoARcPSD| 45315597 Internal schema
describes the physical storage structure of the database. Conceptual schema
describes the structure of the whole database for a community of users.
External schemas or user views.
describes the part of the database that a particular user group is
interested in and hides the rest of the database from that user group. 10 DBMS Utilities Loading
To load existing data files into the database Backup & recovery
To create a backup copy of the database
To recovery an old state of DB from a backup file File reorganization
To reorganize a database file in order to improve performance Report generation
To generate reports based on the information from the database Performance monitoring
To provide the DBA with statistical data about the DB usage. 11 lOMoARcPSD| 45315597 12 lOMoARcPSD| 45315597 What is Database Tuning?
Activity of making a database application run faster:
– Faster means higher throughput (or responsetime)
– Avoiding transactions that create bottlenecks
oravoiding queries that run for hours unnecessarily is a must.
– A 5% improvement is significant. 2 lOMoARcPSD| 45315597 Why Database Tuning? Troubleshooting:
Make managers and users happy given an
application and a DBMS Capacity Sizing:
Buy the right DBMS given application requirements Application Programming:
Coding your application for performance 3 lOMoARcPSD| 45315597 Outline Schema tuning Index tuning Query Processing Query tuning Transaction Management Transaction tuning
Tuning Distributed Application 6 lOMoARcPSD| 45315597 Tuning Principles 1. Think globally, fix locally 2.
Partitioning breaks bottlenecks temporal and spatial 3.
Start-up costs are high; running costs are low 4.
Render unto server what is due unto server 5. Be prepared for trade-offs 7 Think globally, fix locally
Proper identification of problem; minimal intervention
Understand the whole, including the application goals
before taking a set of queries and find the indexes that speed them up. Example:
High I/O, paging and processor utilization may be due to frequent
query scans instead of using an index or log sharing a disk with
some frequently accessed data. 8 lOMoARcPSD| 45315597
Partitioning breaks bottlenecks
Technique for reducing the load on a certain component
of the system either by dividing the load over more
resources or by spreading the load over time
Partitioning may not always solve bottleneck:
First, try to speed up the component
If it doesn’t work, partition Example:
Lock and resource contention among long and short transactions 9
Start-up costs are high; running costs are low
Obtain the effect you want with the fewest possible start- ups Examples:
It is expensive to begin a read operation on a disk, but once it
starts disk can deliver data at high speed.
So, frequently scanned tables should be laid out consecutively on disk.
Cost of parsing, semantic analysis, and selecting access
paths for simple queries is significant So, often executed queries should be compiled lOMoARcPSD| 45315597 10
Render unto server what is due unto server
Important design question is the allocation of work
between the DB system (server) and the application program (client) Depends on:
Relative computing resources of client and server
Where the relevant information is located
Whether the DB task interacts with the screen 11 lOMoARcPSD| 45315597 Be prepared for trade-offs
Increasing speed of application requires combination of
memory, disk and computational resources Examples:
Adding an index => speeds up critical query, but increases
disk storage, and space in RAM
Increasing RAM => Decreasing I/O, speed up query, but spending more money 12 lOMoARcPSD| 45315597 lOMoARcPSD| 45315597 Database Schema
• A relation schema ~ name, {attributes} R(a int, b varchar[20]);
• A relation instance for R ~ a set of records over the
attributes in the schema for R 3
Some schema are better than others ? • Schema1:
OnOrder1(supplier_id, part_id, quantity, supplier_address) • Schema 2:
OnOrder2(supplier_id, part_id, quantity);
Supplier(supplier_id, supplier_address); 4 lOMoARcPSD| 45315597 Functional Dependencies
• X ~ a set of attributes of relation R, and A is a single attribute of R. X -> A holds for R iff:
• For any relation instance I of R, whenever there are two
records r and r’ in I with the same X values, they have the same A value as well.
• OnOrder1(supplier_id, part_id, quantity, supplier_address)
supplier_id -> supplier_address 5 Key of a Relation
• Subset of attributes X from R constitutant a key of R if
X determines every attribute in R and no proper subset
of X determines an attribute in R.
• OnOrder1(supplier_id, part_id, quantity, supplier_address)
– supplier_id, part_id is not a key
• Supplier(supplier_id, supplier_address); – Supplier_id is a key 6 lOMoARcPSD| 45315597 Normalization
• A relation is normalized if every interesting
functional dependency X -> A involving attributes in
R has the property that X is a key of R
• OnOrder1 is not normalized
• OnOrder2 and Supplier are normalized
• Here normalized, refers to BCNF (Boyce-Codd Normal Form) 7 Example #1
• Suppose that a bank associates each customer with
his or her home branch. Each branch is in a specific legal jurisdiction.
R(customer, branch, jurisdiction) normalized? 8