Titles in this page

Thursday, December 16, 2010

Thanks for releasing MySQL 5.5 GA!

  I am excited to see that MySQL 5.5 GA (5.5.8) has been finally released . All of my ex-colleagues at MySQL/Oracle have done amazing jobs. I was also positively surprised that 5.5 was released on schedule:). So far I have felt that Oracle is leading MySQL development and product management very well.
  5.5 has lots of practical features not only for web services providers, but also for enterprise users. Especially I like the following features.

Improved concurrency

  Very often lots of concurrent sessions access to the same table. In such cases, one of MySQL internal global mutexes (LOCK_open) becomes very hot and serious concurrency problems have happened. Increasing table_cache have caused even negative impacts (because MySQL has to do linear search here. See bug#33948 for details). In 5.5, the problem has been fixed. Here is a very simple concurrent PK lookup benchmark result(using mysqlslap). It's good to see that the worst performance line has been significantly improved in 5.5.

  Other global mutexes such as InnoDB kernel mutex are still hot, but overall performance becomes much better.
  It's also great that the limit of 1023 concurrent data-modifying transactions has been raised in 5.5.

4-byte UTF-8

  I'm not sure how many people are aware of this issue, but this is important for users in APAC regions such as China/Japan/Korea. Most of multi-byte characters consume 2 or 3 bytes in UTF-8 per character, but some characters (a few hundred Japanese characters) consume 4 bytes. Prior to MySQL 5.5, 4-byte UTF-8 characters were not supported. If people need to handle these characters, they have to use local character sets such as cp932, gbk5. Now we can use UTF-8 safely (You need to use "utf8mb4" charset).

Audit Plugin Interface

  Lots of users in financial industries have wanted this feature for a long time. They frequently want to audit following information.
- Login/Access Timestamp
- Failed login/access information
- Database username
- Client IP/Hostname or terminal name
- Accessed table/view/other objects name and action
- Full executed query string
- The number of affected/retrieved rows
  MySQL general query log has most of the above information, but it does not have any filtering feature so you have to parse huge general logs, which is not efficient. By writing audit plugins, you can log only specific events.

Semi-Synchronous Replication

  By using Semi-Synchronous replication, the likelihood of the data loss on slaves in case of master crash will be much more reduced. Since this is not fully synchronous approach, you still have risks to lose data in case of crash. But in practice this can be very helpful for less strict users (i.e. web services providers).

"Change buffering" in InnoDB (speeding up DELETEs)

 InnoDB have supported "Insert Buffering" feature for a long time. Insert Buffering can significantly improve insert performance if you have lots of non-unique secondary indexes.
  On the other hand, insert buffering does not have any effect for delete-marking or purging index entries. In general, modifying indexes cause random disk reads when target index leaf blocks are not cached, which significantly slows down response time. That's one of the reasons why bulk deletes(and updates that modifies lots of secondary indexes) were not fast in InnoDB. In 5.5, "Insert Buffering" feature was extended to "Change Buffering", which is effective not only for inserts, but also for deletes. Here is a slide (p.18) from MySQL team that shows the effect of Change Buffering (160x faster response time for bulk deleting 100k rows).

  I am also interested in Performance Schema. I like FILE_SUMMARY_BY_INSTANCE table which enables to fetch per-file disk i/o statistics. If Performance Schema can also be used to fetch per-index statistics (like userstats patch) and can be disabled/enabled dynamically, that would be great.