Thiết kế và quản trị cơ sở dữ liệu_Thầy Trần Việt Trung| Bài giảng môn thiết kế và quản trị cơ sở dữ liệu| Trường Đại học Bách Khoa Hà Nội
Thiết kế và quản trị cơ sở dữ liệu_Thầy Trần Việt Trung| Bài giảng môn thiết kế và quản trị cơ sở dữ liệu| Trường Đại học Bách Khoa Hà Nội. Tài liệu gồm 585 trang giúp bạn ôn tập và đạt kết quả cao trong kỳ thi sắp tới. Mời bạn đọc đón xem.
Môn: Thiết kế và quản trị cơ sở dữ liệu
Trường: Đại học Bách Khoa Hà Nội
Thông tin:
Tác giả:
Preview text:
Database tuning Viet-Trung Tran SoICT 3/23/23 Database Tuning 1 Outline • Course organization
• Introduction to database tuning
• Basic principles of tuning 3/23/23 Database Tuning 2 Online resource • Microsoft Teams • Discussion • Exercise • Middle & final grade
• Slides and everything else • (Optional)
https://www.facebook.com/groups/trungtv.students/
• Future career opportunities
• Email: trungtv@soict.hust.edu.vn 3/23/23 Database Tuning 3 Course organization
1. Introduction to database tuning
2. Optimization of schema and data types
3. Reading topic 1: Benchmarking and profiling MySQL server 4. Query tuning 1 5. Query tuning 2
6. Reading topic 2: Advance MySQL features 7. Index tuning 1 8. Index tuning 2
9. Reading topic 3: Replication in MySQL 10. Explain command 11. Concurrency tuning
12. Reading topic 4: High availability and scaling MySQL 13. NoSQL data models
14. Reading topic 5: Backup and recovery 15. Recap 3/23/23 Database Tuning 4 What is database tuning?
• Activity to make database application running faster
• Faster I/O operations: INSERT, SELECT, DELETE, UPDATE,
• And also optimizing storage space, network usage, etc.
• A 5% improvement is important 3/23/23 Database Tuning 5 Tuning parameters
• Everything that make sense • Faster disk • More Ram • Effective index • Good queries
• There is always a cost/trade-off
• Some time cost is low and the benefit very high 3/23/23 Database Tuning 6
Tuning between theory and practice • Practitioner • learning by experiences
• Ex. Never use aggregate functions (AVG) when transaction response time is critical
• Problem: AVG can be ok if less tuples • Theoretician
• Learning by mathematical models
• Ex. Different between indexes
• Problem: rely on ideal assumptions (rare in reality) 3/23/23 Database Tuning 7 Database tuner
• Understand and apply principles
• Understanding: the problem is not about AVG, but scanning large
amount of data (which AVG often does…)
• Principle: Do not scan large amount of data in concurrency
• Apply principle wisely 3/23/23 Database Tuning 8 Five basic tuning principles
• Think global y, fix local y
• Partitioning to break bottlenecks
• Start-up costs are high, running costs are low
• Render on the server what is due on the server • Be prepared for trade-offs 3/23/23 Database Tuning 9 Think global y, fix local y
• Disk activity is high, what to do? • Solution 1: buy more disk
• Solution 2: Speedup queries with longest runtime
• Solution 3: Speedup queries with largest share in runtime 3/23/23 Database Tuning 10
Partitioning breaks bottlenecks
• Rarely al parts of a system are saturated • Partitioning strategies
• Divide load over more resources (add lanes)
• Spread load over time (avoid rush hours) 3/23/23 Database Tuning 11
Start-up costs are high, running costs are low • Reading operation • Disk seek is so expense • Continuous read is cheep • Conclusion
• Frequently scanned tables should be serialized sequential y on disk
• Frequent query that projects few columns: vertical y partition table – column based organization 3/23/23 Database Tuning 12 Network latency
• Sending many smal messages vs. sending little big message
• Ex. Sending 1 byte packet is almost as expense as sending 1 KB packet 3/23/23 Database Tuning 13 Query overhead
• Query vs. Store procedure (compiled query)
• Compile often executed queries 3/23/23 Database Tuning 14
Connection overhead from programing languages • Open connection • Significant overhead • Establish connection • User authentication
• Connection caching and pooling
• Do one SELECT and loops over results vs. Doing SELECT in the loop • Lesson learned
• Obtain results with the fewest possible startups 3/23/23 Database Tuning 15 Be prepared for trade-offs
• Making one query faster may slow down other queries
• Index can make certain queries faster, but
• Addition disk space required • Slow down insert, update 3/23/23 Database Tuning 16
Optimizing Schema and Data Types Viet-Trung Tran
School of Information and Communication Technology 1 Outline • Data integration • Introduction • Current approaches • Apache Nifi • Hand-ons Apache Nifi • Data preprocessing • Introduction • Data quality • Data preprocessing steps • Hand-ons Openrefine 2 Choosing Optimal Data Types • Smal er is usual y better
• Faster, less space on disk, in memory, and in the CPU cache • Simple is good
• Eg. integers are cheaper to compare than characters • Avoid NULL if possible
• nullable columns make indexes, index statistics, and value comparisons more complicated 3 Whole and real numbers • Whole Numbers • TINYINT: 8 bits • SMALLINT: 16 bits • MEDIUMINT: 24 bits • INT: 32 bits • or BIGINT: 64 bits
• optional y have the UNSIGNED attribute • Real Numbers
• DECIMAL: storing exact fractional numbers. Eg. DECIMAL(18, 9)
• only when you need exact results for fractional numbers • FLOAT: 32 bits • DOUBLE: 64 bits 4