Thiết kế và quản trị cơ sở dữ liệu_Thầy Trần Việt Trung| Bài giảng môn thiết kế và quản trị cơ sở dữ liệu| Trường Đại học Bách Khoa Hà Nội

Thiết kế và quản trị cơ sở dữ liệu_Thầy Trần Việt Trung| Bài giảng môn thiết kế và quản trị cơ sở dữ liệu| Trường Đại học Bách Khoa Hà Nội. Tài liệu gồm 585 trang giúp bạn ôn tập và đạt kết quả cao trong kỳ thi sắp tới. Mời bạn đọc đón xem.

Database tuning
Viet-Trung Tran
SoICT
3/23/23 Database Tuning
1
Outline
Course organization
Introduction to database tuning
Basic principles of tuning
3/23/23 Database Tuning
2
Online resource
Microsoft Teams
Discussion
Exercise
Middle & final grade
Slides and everything else
(Optional)
https://www.facebook.com/groups/trungtv.students/
Future career opportunities
Email: trungtv@soict.hust.edu.vn
3/23/23 Database Tuning
3
Course organization
1. Introduction to database tuning
2. Optimization of schema and data types
3. Reading topic 1: Benchmarking and profiling MySQL server
4. Query tuning 1
5. Query tuning 2
6. Reading topic 2: Advance MySQL features
7. Index tuning 1
8. Index tuning 2
9. Reading topic 3: Replication in MySQL
10. Explain command
11. Concurrency tuning
12. Reading topic 4: High availability and scaling MySQL
13. NoSQL data models
14. Reading topic 5: Backup and recovery
15. Recap
3/23/23 Database Tuning
4
What is database tuning?
Activity to make database application running faster
Faster I/O operations: INSERT, SELECT, DELETE, UPDATE,
And also optimizing storage space, network usage, etc.
A 5% improvement is important
3/23/23 Database Tuning
5
Tuning parameters
Everything that make sense
Faster disk
More Ram
Effective index
Good queries
There is always a cost/trade-off
Some time cost is low and the benefit very high
3/23/23 Database Tuning
6
Tuning between theory and practice
Practitioner
learning by experiences
Ex. Never use aggregate functions (AVG) when transaction response
time is critical
Problem: AVG can be ok if less tuples
Theoretician
Learning by mathematical models
Ex. Different between indexes
Problem: rely on ideal assumptions (rare in reality)
3/23/23 Database Tuning
7
Database tuner
Understand and apply principles
Understanding: the problem is not about AVG, but scanning large
amount of data (which AVG often does…)
Principle: Do not scan large amount of data in concurrency
Apply principle wisely
3/23/23 Database Tuning
8
Five basic tuning principles
Think globally, fix locally
Partitioning to break bottlenecks
Start-up costs are high, running costs are low
Render on the server what is due on the server
Be prepared for trade-offs
3/23/23 Database Tuning
9
Think globally, fix locally
Disk activity is high, what to do?
Solution 1: buy more disk
Solution 2: Speedup queries with longest runtime
Solution 3: Speedup queries with largest share in runtime
3/23/23 Database Tuning
10
Partitioning breaks bottlenecks
Rarely all parts of a system are saturated
Partitioning strategies
Divide load over more resources (add lanes)
Spread load over time (avoid rush hours)
3/23/23 Database Tuning
11
Start-up costs are high, running costs are low
Reading operation
Disk seek is so expense
Continuous read is cheep
Conclusion
Frequently scanned tables should be serialized sequentially on disk
Frequent query that projects few columns: vertically partition table –
column based organization
3/23/23 Database Tuning
12
Network latency
Sending many small messages vs. sending little big message
Ex. Sending 1 byte packet is almost as expense as sending 1
KB packet
3/23/23 Database Tuning
13
Query overhead
Query vs. Store procedure (compiled query)
Compile often executed queries
3/23/23 Database Tuning
14
Connection overhead from programing
languages
Open connection
Significant overhead
Establish connection
User authentication
Connection caching and pooling
Do one SELECT and loops over results vs. Doing SELECT in the
loop
Lesson learned
Obtain results with the fewest possible startups
3/23/23 Database Tuning
15
Be prepared for trade-offs
Making one query faster may slow down other queries
Index can make certain queries faster, but
Addition disk space required
Slow down insert, update
3/23/23 Database Tuning
16
Optimizing Schema and Data Types
Viet-Trung Tran
School of Information and Communication Technology
1
Outline
Data integration
Introduction
Current approaches
Apache Nifi
Hand-ons Apache Nifi
Data preprocessing
Introduction
Data quality
Data preprocessing steps
Hand-ons Openrefine
2
Choosing Optimal Data Types
Smaller is usually better
Faster, less space on disk, in memory, and in the CPU cache
Simple is good
Eg. integers are cheaper to compare than characters
Avoid NULL if possible
nullable columns make indexes, index statistics, and value comparisons
more complicated
3
Whole and real numbers
Whole Numbers
TINYINT: 8 bits
SMALLINT: 16 bits
MEDIUMINT: 24 bits
INT: 32 bits
or BIGINT: 64 bits
optionally have the UNSIGNED attribute
Real Numbers
DECIMAL: storing exact fractional numbers. Eg. DECIMAL(18, 9)
only when you need exact results for fractional numbers
FLOAT: 32 bits
DOUBLE: 64 bits
4
| 1/585

Preview text:

Database tuning Viet-Trung Tran SoICT 3/23/23 Database Tuning 1 Outline • Course organization
• Introduction to database tuning
• Basic principles of tuning 3/23/23 Database Tuning 2 Online resource • Microsoft Teams • Discussion • Exercise • Middle & final grade
• Slides and everything else • (Optional)
https://www.facebook.com/groups/trungtv.students/
• Future career opportunities
• Email: trungtv@soict.hust.edu.vn 3/23/23 Database Tuning 3 Course organization
1. Introduction to database tuning
2. Optimization of schema and data types
3. Reading topic 1: Benchmarking and profiling MySQL server 4. Query tuning 1 5. Query tuning 2
6. Reading topic 2: Advance MySQL features 7. Index tuning 1 8. Index tuning 2
9. Reading topic 3: Replication in MySQL 10. Explain command 11. Concurrency tuning
12. Reading topic 4: High availability and scaling MySQL 13. NoSQL data models
14. Reading topic 5: Backup and recovery 15. Recap 3/23/23 Database Tuning 4 What is database tuning?
• Activity to make database application running faster
• Faster I/O operations: INSERT, SELECT, DELETE, UPDATE,
• And also optimizing storage space, network usage, etc.
• A 5% improvement is important 3/23/23 Database Tuning 5 Tuning parameters
• Everything that make sense • Faster disk • More Ram • Effective index • Good queries
• There is always a cost/trade-off
• Some time cost is low and the benefit very high 3/23/23 Database Tuning 6
Tuning between theory and practice • Practitioner • learning by experiences
• Ex. Never use aggregate functions (AVG) when transaction response time is critical
• Problem: AVG can be ok if less tuples • Theoretician
• Learning by mathematical models
• Ex. Different between indexes
• Problem: rely on ideal assumptions (rare in reality) 3/23/23 Database Tuning 7 Database tuner
• Understand and apply principles
• Understanding: the problem is not about AVG, but scanning large
amount of data (which AVG often does…)
Principle: Do not scan large amount of data in concurrency
Apply principle wisely 3/23/23 Database Tuning 8 Five basic tuning principles
• Think global y, fix local y
• Partitioning to break bottlenecks
• Start-up costs are high, running costs are low
• Render on the server what is due on the server • Be prepared for trade-offs 3/23/23 Database Tuning 9 Think global y, fix local y
• Disk activity is high, what to do? • Solution 1: buy more disk
• Solution 2: Speedup queries with longest runtime
• Solution 3: Speedup queries with largest share in runtime 3/23/23 Database Tuning 10
Partitioning breaks bottlenecks
• Rarely al parts of a system are saturated • Partitioning strategies
• Divide load over more resources (add lanes)
• Spread load over time (avoid rush hours) 3/23/23 Database Tuning 11
Start-up costs are high, running costs are low • Reading operation • Disk seek is so expense • Continuous read is cheep • Conclusion
• Frequently scanned tables should be serialized sequential y on disk
• Frequent query that projects few columns: vertical y partition table – column based organization 3/23/23 Database Tuning 12 Network latency
• Sending many smal messages vs. sending little big message
• Ex. Sending 1 byte packet is almost as expense as sending 1 KB packet 3/23/23 Database Tuning 13 Query overhead
• Query vs. Store procedure (compiled query)
• Compile often executed queries 3/23/23 Database Tuning 14
Connection overhead from programing languages • Open connection • Significant overhead • Establish connection • User authentication
• Connection caching and pooling
• Do one SELECT and loops over results vs. Doing SELECT in the loop • Lesson learned
• Obtain results with the fewest possible startups 3/23/23 Database Tuning 15 Be prepared for trade-offs
• Making one query faster may slow down other queries
• Index can make certain queries faster, but
• Addition disk space required • Slow down insert, update 3/23/23 Database Tuning 16
Optimizing Schema and Data Types Viet-Trung Tran
School of Information and Communication Technology 1 Outline • Data integration • Introduction • Current approaches • Apache Nifi • Hand-ons Apache Nifi • Data preprocessing • Introduction • Data quality • Data preprocessing steps • Hand-ons Openrefine 2 Choosing Optimal Data Types • Smal er is usual y better
• Faster, less space on disk, in memory, and in the CPU cache • Simple is good
• Eg. integers are cheaper to compare than characters • Avoid NULL if possible
• nullable columns make indexes, index statistics, and value comparisons more complicated 3 Whole and real numbers • Whole Numbers • TINYINT: 8 bits • SMALLINT: 16 bits • MEDIUMINT: 24 bits • INT: 32 bits • or BIGINT: 64 bits
• optional y have the UNSIGNED attribute • Real Numbers
• DECIMAL: storing exact fractional numbers. Eg. DECIMAL(18, 9)
• only when you need exact results for fractional numbers • FLOAT: 32 bits • DOUBLE: 64 bits 4