Monday, April 14, 2008

Innodb Performance Optimization Basics

Posted By peter On November 1, 2007 @ 9:17 am In Innodb | 30 Comments

When interviewing people for our [1] Job Openings I like to ask a basic question: if you have a server with 16GB of RAM dedicated to MySQL, running a large Innodb database under a typical Web workload, which settings would you adjust? Interestingly enough, most people fail to come up with anything reasonable. So I decided to publish the answer I would like to hear, extending it with the basics of hardware, OS and application optimization.
I call this Innodb Performance Optimization Basics, so these are general guidelines which work well for a wide range of applications, though the optimal settings of course depend on the workload.

Hardware
If you have a large Innodb database, memory is paramount. 16G-32G is the cost-efficient amount these days. From the CPU standpoint, 2*Dual Core CPUs seem to do very well, while with even just two Quad Core CPUs scalability issues can be observed on many workloads, though this depends a lot on the application. The third component is the IO subsystem: directly attached storage with plenty of spindles and RAID with battery backed up cache is a good bet. Typically you can get 6-8 hard drives in a standard case and often that is enough, though sometimes you may need more. Also note the new 2.5″ SAS hard drives. They are tiny but often faster than bigger ones. RAID10 works well for data storage, and for read-mostly cases where you still would like some redundancy RAID5 can work pretty well too, but beware of random writes to RAID5.

Operating System
First, run a 64bit operating system. We still see people running 32bit Linux on 64bit capable boxes with plenty of memory. Do not do this. If using Linux, set up LVM for the database directory to get more efficient backups. The EXT3 file system works OK in most cases, though if you’re running into particular roadblocks with it, try XFS. You can use the noatime and nodiratime options if you’re using innodb_file_per_table and a lot of tables, though the benefit of these is minor. Also make sure you wrestle the OS so it does not swap MySQL out of memory.

MySQL Innodb Settings
The most important ones are:
innodb_buffer_pool_size - 70-80% of memory is a safe bet. I set it to 12G on a 16GB box.
UPDATE: If you’re looking for more details, check out the detailed guide on [2] tuning innodb buffer pool
innodb_log_file_size - This depends on your recovery speed needs, but 256M seems to be a good balance between reasonable recovery time and good performance.
innodb_log_buffer_size=4M - 4M is good for most cases unless you’re piping large blobs to Innodb; in that case increase it a bit.
innodb_flush_log_at_trx_commit=2 - If you’re not concerned about ACID and can lose the transactions from the last second or two in case of a full OS crash, then set this value. It can have a dramatic effect, especially on a lot of short write transactions.
innodb_thread_concurrency=8 - Even with the current Innodb scalability fixes, having limited concurrency helps. The actual number may be higher or lower depending on your application, and the default, which is 8, is a decent start.
innodb_flush_method=O_DIRECT - Avoids double buffering and reduces swap pressure; in most cases this setting improves performance. Be careful, though, if you do not have battery backed up RAID cache, as write IO may suffer.
innodb_file_per_table - If you do not have too many tables use this option, so you will not have uncontrolled growth of the main innodb tablespace which you can’t reclaim. This option was added in MySQL 4.1 and is now stable enough to use.

Also check if your application can run in READ-COMMITTED isolation mode. If it can, make it the default with transaction-isolation=READ-COMMITTED. This option has some performance benefits, especially in locking in 5.0, and there are even more to come with MySQL 5.1 and row level replication.
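To make the list above concrete, here is a minimal sketch that turns it into a my.cnf fragment for a given amount of RAM. The helper names (innodb_settings, render_my_cnf) and the 75% default are my own illustration, not part of MySQL; the values themselves are simply the starting points suggested above and should be adjusted to your workload.

```python
def innodb_settings(ram_gb, pool_fraction=0.75):
    """Return the starting-point settings from this post as (name, value) pairs.

    ram_gb and pool_fraction are illustrative parameters; 0.75 sits in the
    70-80% range suggested above.
    """
    if not 0.70 <= pool_fraction <= 0.80:
        raise ValueError("this sketch only covers the 70-80% range from the post")
    pool_gb = int(ram_gb * pool_fraction)  # round down to whole gigabytes
    return [
        ("innodb_buffer_pool_size", f"{pool_gb}G"),
        ("innodb_log_file_size", "256M"),
        ("innodb_log_buffer_size", "4M"),
        ("innodb_flush_log_at_trx_commit", "2"),  # trades durability for speed
        ("innodb_thread_concurrency", "8"),
        ("innodb_flush_method", "O_DIRECT"),
        ("innodb_file_per_table", None),  # flag option, written without a value
        ("transaction-isolation", "READ-COMMITTED"),
    ]


def render_my_cnf(ram_gb, pool_fraction=0.75):
    """Render the settings as text for the [mysqld] section of my.cnf."""
    lines = ["[mysqld]"]
    for name, value in innodb_settings(ram_gb, pool_fraction):
        lines.append(name if value is None else f"{name}={value}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_my_cnf(16))
```

For a 16GB box this produces innodb_buffer_pool_size=12G, which matches the value mentioned above. Treat the output as a draft to review, not something to paste in blindly.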

There are a bunch of other options you may want to tune, but let’s focus only on the Innodb ones today. You can read about [3] tuning other options here or read one of our [4] MySQL Presentations.

Application tuning for Innodb
Especially when coming from a MyISAM background, there are some changes you will want to make to your application. First, make sure you’re using transactions when doing updates, both for the sake of consistency and to get better performance. Next, if your application has any writes, be prepared to handle deadlocks, which may happen. Third, review your table structure and see how you can take advantage of Innodb properties: clustering by primary key, having the primary key in all indexes (so keep the primary key short), fast lookups by primary key (try to use it in joins), and large unpacked indexes (so try to go easy on indexes).
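Being prepared to handle deadlocks usually means re-running the whole transaction, since Innodb rolls back the transaction it picks as the deadlock victim. A minimal sketch of such a retry loop follows. DeadlockError is a stand-in I made up for this example; a real driver raises its own exception type, which you would check for MySQL error code 1213. The transaction callable is assumed to begin, do its work and commit on its own.

```python
import time

ER_LOCK_DEADLOCK = 1213  # MySQL error code reported for a deadlock victim


class DeadlockError(Exception):
    """Hypothetical exception standing in for the driver's deadlock error."""


def run_with_deadlock_retry(transaction, max_attempts=3, backoff_seconds=0.0):
    """Call transaction() and re-run it from the start if it hits a deadlock.

    transaction is assumed to be a callable that begins, does its work and
    commits; on the last attempt the deadlock error is passed to the caller.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DeadlockError:
            if attempt == max_attempts:
                raise
            # Wait a little longer before each retry (no wait by default).
            time.sleep(backoff_seconds * attempt)
```

The important design point is that the retry wraps the complete transaction, not just the statement that failed, because the earlier statements were rolled back too.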

With these basic innodb performance tunings you will be better off than the majority of Innodb users, who take MySQL with defaults, run it on hardware without battery backed up cache, make no OS changes and make no changes to an application which was written with MyISAM tables in mind.


30 Comments To "Innodb Performance Optimization Basics"

#1 Comment By Jeffrey Gilbert On November 1, 2007 @ 11:21 am

I’m happy to say that through reading this site regularly and getting suggestions from the forums I’ve been able to consistently shave off seconds of load time from my site over the past year bringing page load times to an almost instant state. It does take patience in testing new settings, especially when dealing with older slower 32bit hardware, but the payoffs are there and the lessons learned are priceless. My old slow query log was filled with thousands of unsolvable mysteries every day and the slow query time was only set to 10 seconds! Now that I’ve tuned everything up in the settings and have a better understanding of what each setting does in the my.cnf, I have it set to 3 seconds and only find that just around 100-200 queries a day are slower than that (usually because i dont have a failover server during backups which are causing locks that slow things down. working on it!)

I’ve seen great speed improvements using just these tips alone. What I don’t see here which is something that many novice administrators or tuners may not know is that if you set your buffers and settings too high and restart your mysql server, mysql wont instantly complain. What I think happens is it either ignores these settings completely and uses defaults or it uses them, discovers that they dont work for the session, reverts to the defaults or recovers in some other way which is slow. This can seriously impair your performance!

My only wishes for mysql would be that they would allow you to log queries which trigger counters of things like sort_merge_pass, full joins and tmp tables on disk so you could actually better find the queries causing slowdowns or poorly written queries in your applications, AS WELL AS a tool that would allow you to see how your buffers were being used in a visual way rather than just guessing through examining the raw numbers. These two changes would make administration lightyears more advanced than it is now for novice or intermediate developers/admins. Out of 801,000 tmp tables created, only 3,762 of those were on disk. It still bugs me that I can’t just look at a log and find them to fix them. I do have 0 Select_full_join and 0 Sort_merge_passes though finally.

What is most confidence inspiring is thinking about the day when i can take the kid gloves off and run my database on a 64bit machine with a more acceptable amount of ram. After being hamstrung this long with 32bit chips, I can’t wait to see how things perform with the newest tech out there!

#2 Comment By Jay Janssen On November 1, 2007 @ 11:57 am

I have to disagree with the 70-80% of RAM usage for the buffer pool. When I asked Heikki about it at your and his talk during the conference he admitted that was based on his test box with 1G of RAM. I’ve seen people with 64G of RAM blindly following the 80% rule and only using about 50G of RAM for the buffer pool, leaving 14G unused!

I tend to tell people to leave a few GB for the operating system, and let the buffer pool use the rest. 4G might not be too unreasonable on a 16G box, depending on what else is going on, but I’d probably start with 2G and work up if needed. It’s super important to use O_DIRECT when tuning this, otherwise the OS will snatch up all of your free RAM for fs caching.

#3 Comment By Jay Janssen On November 1, 2007 @ 11:58 am

P.S.

Good post though :) Agrees with much of what I tell people at Yahoo.

#4 Comment By Xaprb On November 1, 2007 @ 12:22 pm

I’d just like to point out that Peter is giving you a sneak peek at the upcoming second edition of High Performance MySQL here. This post is like the cliff notes version of the InnoDB tuning advice in the book. So if you like Peter’s posts, get the book when it comes out.

#5 Comment By Jeremy Cole On November 1, 2007 @ 12:22 pm

Howdy,

Echoing what Jay says, I wouldn’t suggest a percentage for the buffer pool, rather a relatively fixed size, as the percentage doesn’t scale well as memory sizes have grown. I usually go for 14G on a 16G box, potentially reducing it if more than normal amounts of memory are needed for other things (say, a very high number of temp tables).

Regards,

Jeremy

#6 Comment By peter On November 1, 2007 @ 12:47 pm

Jay, Jeremy

I guess “how much to use for Innodb Buffer Pool” is a question whose answer may depend on a lot of things. As I mentioned, I provide some basic guidelines in this post which I would like to keep simple, and 70-80% is a good answer in this case. It works for the most typical range of boxes, say 4GB-32GB, and it is safe even though you’re not getting every single penny of performance.

Your advice to leave a bit for MySQL and OS needs and give the rest to the Innodb Buffer Pool is good, but how would one know how much memory is needed for these?

Also note that not everything may work as you would expect in theory. For example, even with O_DIRECT the OS may be swapping out portions of MySQL due to IO pressure, which may come from logs, disk based sorts or disk based temporary tables.

Another thing you need to take into account is caching of the Innodb logs. As IO to the Innodb logs is unaligned, you had better have them fit in the cache, otherwise you will be getting read-around-write stalls every so often.

But you’re right of course: for 64GB you would want the buffer pool to be significantly higher than 50G.
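The two sizing approaches from this thread, a percentage of RAM versus a fixed reserve for the OS and the rest of MySQL, can be compared with a few lines of arithmetic. This is only a sketch: the function names are made up for the example, and the 2GB reserve is just the starting figure mentioned in the comments above, not a recommendation.

```python
def pool_by_percentage(ram_gb, fraction=0.8):
    """Buffer pool size in GB when taking a fixed fraction of RAM."""
    return ram_gb * fraction


def pool_by_reserve(ram_gb, reserve_gb=2):
    """Buffer pool size in GB when leaving a fixed reserve for everything else."""
    return ram_gb - reserve_gb


if __name__ == "__main__":
    for ram in (4, 16, 32, 64):
        pct = pool_by_percentage(ram)
        fixed = pool_by_reserve(ram)
        left_over = ram - pct
        print(f"{ram}GB box: 80% rule gives {pct:.1f}G "
              f"(leaves {left_over:.1f}G), fixed reserve gives {fixed}G")
```

The memory left outside the buffer pool under the percentage rule grows with the box, which is the objection raised above, while the fixed reserve stays constant and so needs a good estimate of what logs, sorts and temporary tables really require.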

#7 Comment By Keith Murphy On November 1, 2007 @ 12:51 pm

Great posting. Can you do me a favor and expand on this please??? “Also make sure you wrestle OS so it would not swap out MySQL out of memory” I know what you mean by this..just don’t know how to do it..We run 64-bit Linux (debian actually).

thanks,

Keith

#8 Comment By peter On November 1, 2007 @ 1:01 pm

First, check the “si” and “so” columns in vmstat. If you have some swap used but there is no swapping activity I would not worry; it is when these values are significant (sometimes in bursts) that you’re in trouble.

O_DIRECT is great if you’re using Innodb. You can also use large pages to make the MyISAM key buffer and Query Cache non-swappable (and get some other benefits); there are some instructions here:
[5] http://www.mysqlperformanceblog.com/2006/06/08/mysql-server-variables-sql-layer-or-storage-engine-specific/

You can use --memlock with varying success; a lot seems to depend on the Linux kernel version as to whether it works properly. You can also try echo 0 > /proc/sys/vm/swappiness, though in my experience it does not really work well for preventing swapping.
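The vmstat check described above can be scripted. The sketch below parses captured vmstat output and reports the swap-in/swap-out pairs; the function names and the sample output are made up for illustration, and the parser assumes the usual layout where a header row names the columns. Note that the first row vmstat prints is an average since boot, so later samples are the ones to watch.

```python
def swap_activity(vmstat_output):
    """Return a list of (si, so) pairs, one per numeric sample row."""
    rows = [line.split() for line in vmstat_output.strip().splitlines()]
    # Locate the header row by the column names we care about.
    header = next(cols for cols in rows if "si" in cols and "so" in cols)
    si_idx, so_idx = header.index("si"), header.index("so")
    samples = []
    for cols in rows:
        if len(cols) == len(header) and all(c.isdigit() for c in cols):
            samples.append((int(cols[si_idx]), int(cols[so_idx])))
    return samples


def is_swapping(vmstat_output, threshold=0):
    """True if any sample shows swap-in or swap-out above the threshold."""
    return any(si > threshold or so > threshold
               for si, so in swap_activity(vmstat_output))


# Invented sample data, not taken from a real server.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0  10240  51200  20480 900000    0    0    12    40  300  500  5  1 93  1
 2  1  10240  40960  20480 900000  120  340   800  1200  900 1500 20  5 60 15
"""

if __name__ == "__main__":
    print(swap_activity(SAMPLE))
    print(is_swapping(SAMPLE))
```

In the sample, swpd is non-zero in both rows, but only the second row shows actual swapping activity, which is the case to worry about.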

#9 Comment By peter On November 1, 2007 @ 4:00 pm

Jeffrey,

You should take a look at another post:
[6] http://www.mysqlperformanceblog.com/2007/10/31/new-patch-for-mysql-performance/

We just created a patch which allows logging query flags with queries, so you can see which queries caused on-disk temporary tables and which required a filesort. Now you just need a small script to filter through the log.

We will surely modify our data aggregation scripts so they can use this log format.

#10 Comment By Don MacAskill On November 1, 2007 @ 8:25 pm

I’ve been doing all of this stuff for years… or so I thought. :) Buried in there, you say ‘having primary key in all indexes’. Can you elaborate more?

Let’s take a sample table:

CREATE TABLE `users` (
`UserID` smallint(4) unsigned NOT NULL auto_increment,
`Email` varchar(255) NOT NULL,
PRIMARY KEY (`UserID`),
KEY `Email` (`Email`)
) ENGINE=InnoDB;

Are you saying that this would be better when doing queries for UserID based on Email:

CREATE TABLE `users` (
`UserID` smallint(4) unsigned NOT NULL auto_increment,
`Email` varchar(255) NOT NULL,
PRIMARY KEY (`UserID`),
KEY `Email` (`Email`, `UserID`)
) ENGINE=InnoDB;

?

If so, it looks like I (wrongly?) assumed that the Primary Key was always referenced by other indexes. I’ve never seen this be a problem, that I know of, but now I’m wondering…

Thanks!

#11 Comment By Ben Schwarz On November 1, 2007 @ 9:24 pm

These kinds of posts are great; really helpful to get some insight to the mysteries of innodb and mysql tuning.
However, my only gripe is that it all feels a bit like random ‘lets tweak this and see’, rather than putting a test suite behind it with your own hardware.

#12 Comment By peter On November 2, 2007 @ 1:20 am

Ben,

Of course, to get the last percent of performance out of your system you need to set up benchmarks (which match your real workload well) and do experiments. However, you’re better off starting somewhere other than the default MySQL configuration to get results fast, and you do not always have a lot of time to spend on this. So view this as a starting point for Innodb configuration from which you tune further.

#13 Comment By peter On November 2, 2007 @ 1:24 am

Don,

What I’m saying is that if UserID is the primary key in an Innodb table, the key on (Email) is internally (Email,UserID), because the PK value is always stored in the index and rows with the same key value are stored in PK order.

This means the UserID key part can also be used for covering indexes and where clauses, and I think it is being fixed for filesort now. See this post for examples:
[7] http://www.mysqlperformanceblog.com/2006/10/03/mysql-optimizer-and-innodb-primary-key/
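As a toy model of why the key on (Email) can act as a covering index for UserID lookups, the sketch below stores secondary index entries as sorted (Email, UserID) pairs and answers the lookup from the index alone. This is only an illustration of the idea with made-up data and names; it says nothing about how Innodb lays out its pages.

```python
import bisect

# Made-up table contents: UserID (primary key) -> Email.
rows = {1: "carol@example.com", 2: "alice@example.com", 3: "bob@example.com"}

# Toy secondary index on Email: each entry also carries the primary key,
# and entries are kept sorted by (Email, UserID).
email_index = sorted((email, user_id) for user_id, email in rows.items())


def user_ids_for_email(index, email):
    """Answer 'SELECT UserID ... WHERE Email = ?' without touching rows."""
    # (email,) sorts before every (email, user_id) entry with the same email.
    start = bisect.bisect_left(index, (email,))
    result = []
    for key_email, user_id in index[start:]:
        if key_email != email:
            break
        result.append(user_id)
    return result


if __name__ == "__main__":
    print(user_ids_for_email(email_index, "bob@example.com"))
```

Because the primary key value is already in each index entry, adding UserID to the key definition explicitly, as in the second CREATE TABLE above, would not give the lookup anything it does not already have.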

#14 Comment By Mike On November 2, 2007 @ 4:48 am

Are there any rules when specifying a server’s RAM based on the database size? Is 16GB still useful if your database is 6GB? 12GB?

#15 Comment By peter On November 2, 2007 @ 6:14 am

Mike,
Good question. Of course if your database is 6GB and you have 16GB of memory, you will likely have more memory than you can efficiently use. You can allocate it to the Innodb buffer pool, where it will sit as “free pages”, or you can set the buffer pool to a lower value, say 7GB, and let the rest be free on the OS side. Over time the OS will find something to cache there, but in practice that would not be an efficient use anyway. If you expect your data size to grow, I would set it to a higher value so you do not have to revisit and adjust it many times as your database grows.
Of course if there is a mix between MyISAM and Innodb it is other story.

#16 Comment By Don MacAskill On November 2, 2007 @ 8:13 am

Peter,

Oh, great, that’s how I always assumed it was. Whew. Thanks for clarifying!

#17 Pingback By Choosing innodb_buffer_pool_size | MySQL Performance Blog On November 3, 2007 @ 4:41 pm

[…] last post about Innodb Performance Optimization got a lot of comments choosing proper innodb_buffer_pool_size and indeed I oversimplified things a […]

#18 Comment By Charlie Arehart On November 3, 2007 @ 8:25 pm

No one else has commented, so maybe some think it’s self-evident, but I could see some casual (new) readers being confused or misled. Where you said, “We still see people running 32bit Linux or 64bit capable boxes with plenty of memory. Do not do this”, I’m assuming you meant “on”, not “or”. :-)

#19 Comment By peter On November 4, 2007 @ 2:51 am

Thanks Charlie,

Fixed now.

#20 Comment By Jeffrey Gilbert On November 4, 2007 @ 7:50 am

peter, re #9

That’s great news!! I didn’t expect to see something materialize so quickly. I will definitely check that out and appreciate the heads up and effort.

best regards
– Jeff

#21 Comment By Matthew Kent On November 5, 2007 @ 2:00 pm

Trivial: but the atime stuff reminded me that nodiratime isn’t required, see [8] http://lwn.net/Articles/245097/

#22 Comment By peter On November 5, 2007 @ 3:27 pm

Thank you Matt,

Honestly I typically did not use it either, but I picked it up somewhere and added it, as this is one of the things which should not hurt.

#23 Pingback By » The Links » roarin’ reporter On November 20, 2007 @ 9:05 pm

[…] InnoDB Performance Optimization Basics […]

#24 Comment By ajay singh On November 28, 2007 @ 11:24 pm

hi,
just wanted to know the role of mmap in innodb and how is it set … also if anyone can help in the same regard with MyISAM….
thank you very much ..
take care…
ajay.

#25 Comment By Kirby On March 4, 2008 @ 6:44 am

First off I love the blog and would like to thank all of those who contribute.

I did want to point out though that the innodb_flush_logs_at_trx_commit setting you have listed is spelled incorrectly. If I’m not mistaken the setting is innodb_flush_log_at_trx_commit (log should not be pluralized). Thought I would make an effort to point this out given the recent posting about checking MySQL config files.

Keep up the fantastic work.
Kirby

#26 Comment By peter On March 4, 2008 @ 9:59 am

Kirby,

Thank you - fixed.

#27 Comment By Thiru On March 12, 2008 @ 6:41 am

“We still see people running 32bit Linux on 64bit capable boxes with plenty of memory. Do not do this.”

Could you please explain why.

Thanks,
Thiru.

#28 Comment By Thiru On March 12, 2008 @ 6:45 am

Oh, thank you for the many excellent posts! :)

#29 Comment By peter On March 12, 2008 @ 12:08 pm

If you run 32bit Linux you will be limited to a 32bit address space for MySQL, which will limit how much memory you can use.

Plus it will be slower for the kernel to access large memory.
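The address space limit is simple arithmetic: a 32bit pointer can address at most 2^32 bytes. The short sketch below compares that upper bound with the 12G buffer pool suggested in the post; the variable names are just for illustration, and the space actually usable by a single process is smaller than this bound.

```python
GIB = 2 ** 30

# Upper bound on what a 32bit pointer can address, in bytes.
address_space_32bit = 2 ** 32

# Buffer pool size suggested in the post for a 16GB box.
buffer_pool = 12 * GIB

address_space_gib = address_space_32bit // GIB
fits = buffer_pool < address_space_32bit

if __name__ == "__main__":
    print(f"32bit address space: {address_space_gib} GiB")
    print(f"12G buffer pool fits: {fits}")
```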

#30 Comment By Patrick On April 13, 2008 @ 7:50 pm

[..]Of course if there is a mix between MyISAM and Innodb it is other story.[…]
Do you still recommend those settings for a 65% INNODB, 35% MyISAM database? Will MyISAM performance be affected? I’ll soon be switching to a MySQL dedicated server with 16GB of RAM, so this post is really interesting to me.


Article printed from MySQL Performance Blog: http://www.mysqlperformanceblog.com

URL to article: http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/

URLs in this post:
[1] Job Openings: http://www.mysqlperformanceblog.com/jobs/
[2] tuning innodb buffer pool: http://www.mysqlperformanceblog.com/2007/11/03/choosing-innodb_buffer_pool_size/
[3] tuning other options: http://www.mysqlperformanceblog.com/2006/09/29/what-to-tune-in-mysql-server-after-installation/
[4] MySQL Presentations: http://www.mysqlperformanceblog.com/mysql-performance-presentations/
[5] http://www.mysqlperformanceblog.com/2006/06/08/mysql-server-variables-sql-layer-or-storage-engine-specific/
[6] http://www.mysqlperformanceblog.com/2007/10/31/new-patch-for-mysql-performance/
[7] http://www.mysqlperformanceblog.com/2006/10/03/mysql-optimizer-and-innodb-primary-key/
[8] http://lwn.net/Articles/245097/

1 comment:

mahakk01 said...

I don't think I can say anything about this topic, as I am unable to understand it well. I have a few doubts related to it. I understand a bit about the performance optimization concept. If these are the basics, then the advanced level must be more difficult.