


















Preview text:
Database tuning Viet-Trung Tran SoICT 3/23/23 Database Tuning 1 Outline • Course organization
• Introduction to database tuning
• Basic principles of tuning 3/23/23 Database Tuning 2 Online resource • Microsoft Teams • Discussion • Exercise • Middle & final grade
• Slides and everything else • (Optional)
https://www.facebook.com/groups/trungtv.students/
• Future career opportunities
• Email: trungtv@soict.hust.edu.vn 3/23/23 Database Tuning 3 Course organization
1. Introduction to database tuning
2. Optimization of schema and data types
3. Reading topic 1: Benchmarking and profiling MySQL server 4. Query tuning 1 5. Query tuning 2
6. Reading topic 2: Advance MySQL features 7. Index tuning 1 8. Index tuning 2
9. Reading topic 3: Replication in MySQL 10. Explain command 11. Concurrency tuning
12. Reading topic 4: High availability and scaling MySQL 13. NoSQL data models
14. Reading topic 5: Backup and recovery 15. Recap 3/23/23 Database Tuning 4 What is database tuning?
• Activity to make database application running faster
• Faster I/O operations: INSERT, SELECT, DELETE, UPDATE,
• And also optimizing storage space, network usage, etc.
• A 5% improvement is important 3/23/23 Database Tuning 5 Tuning parameters
• Everything that make sense • Faster disk • More Ram • Effective index • Good queries
• There is always a cost/trade-off
• Some time cost is low and the benefit very high 3/23/23 Database Tuning 6
Tuning between theory and practice • Practitioner • learning by experiences
• Ex. Never use aggregate functions (AVG) when transaction response time is critical
• Problem: AVG can be ok if less tuples • Theoretician
• Learning by mathematical models
• Ex. Different between indexes
• Problem: rely on ideal assumptions (rare in reality) 3/23/23 Database Tuning 7 Database tuner
• Understand and apply principles
• Understanding: the problem is not about AVG, but scanning large
amount of data (which AVG often does…)
• Principle: Do not scan large amount of data in concurrency
• Apply principle wisely 3/23/23 Database Tuning 8 Five basic tuning principles
• Think global y, fix local y
• Partitioning to break bottlenecks
• Start-up costs are high, running costs are low
• Render on the server what is due on the server • Be prepared for trade-offs 3/23/23 Database Tuning 9 Think global y, fix local y
• Disk activity is high, what to do? • Solution 1: buy more disk
• Solution 2: Speedup queries with longest runtime
• Solution 3: Speedup queries with largest share in runtime 3/23/23 Database Tuning 10
Partitioning breaks bottlenecks
• Rarely al parts of a system are saturated • Partitioning strategies
• Divide load over more resources (add lanes)
• Spread load over time (avoid rush hours) 3/23/23 Database Tuning 11
Start-up costs are high, running costs are low • Reading operation • Disk seek is so expense • Continuous read is cheep • Conclusion
• Frequently scanned tables should be serialized sequential y on disk
• Frequent query that projects few columns: vertical y partition table – column based organization 3/23/23 Database Tuning 12 Network latency
• Sending many smal messages vs. sending little big message
• Ex. Sending 1 byte packet is almost as expense as sending 1 KB packet 3/23/23 Database Tuning 13 Query overhead
• Query vs. Store procedure (compiled query)
• Compile often executed queries 3/23/23 Database Tuning 14
Connection overhead from programing languages • Open connection • Significant overhead • Establish connection • User authentication
• Connection caching and pooling
• Do one SELECT and loops over results vs. Doing SELECT in the loop • Lesson learned
• Obtain results with the fewest possible startups 3/23/23 Database Tuning 15 Be prepared for trade-offs
• Making one query faster may slow down other queries
• Index can make certain queries faster, but
• Addition disk space required • Slow down insert, update 3/23/23 Database Tuning 16
Optimizing Schema and Data Types Viet-Trung Tran
School of Information and Communication Technology 1 Outline • Data integration • Introduction • Current approaches • Apache Nifi • Hand-ons Apache Nifi • Data preprocessing • Introduction • Data quality • Data preprocessing steps • Hand-ons Openrefine 2 Choosing Optimal Data Types • Smal er is usual y better
• Faster, less space on disk, in memory, and in the CPU cache • Simple is good
• Eg. integers are cheaper to compare than characters • Avoid NULL if possible
• nullable columns make indexes, index statistics, and value comparisons more complicated 3 Whole and real numbers • Whole Numbers • TINYINT: 8 bits • SMALLINT: 16 bits • MEDIUMINT: 24 bits • INT: 32 bits • or BIGINT: 64 bits
• optional y have the UNSIGNED attribute • Real Numbers
• DECIMAL: storing exact fractional numbers. Eg. DECIMAL(18, 9)
• only when you need exact results for fractional numbers • FLOAT: 32 bits • DOUBLE: 64 bits 4