mysqlpdump

Description

MySQL Parallel Dump

A multi-threaded mysqldump is no longer a utopia. mysqlpdump can dump all your tables and databases in parallel, so it can be much faster on systems with multiple CPUs.

By default it stores each table in a separate file. It can also write the dump to stdout, although this is not recommended because it can use all the memory in your system if your tables are big.
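As a rough illustration of the idea (not mysqlpdump's actual code), the sketch below runs one `mysqldump` process per table from a small pool of worker threads and writes each table to its own file. The names `build_command`, `dump_one` and `dump_tables`, the `run` parameter and the `<database>.<table>.sql` file naming are assumptions made for this example.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(database, table, user, password):
    """Build the mysqldump command line for a single table."""
    return ["mysqldump", "--user=" + user, "--password=" + password,
            database, table]


def dump_one(database, table, user, password, destination, run):
    """Dump one table into its own file and return the file path."""
    path = os.path.join(destination, "%s.%s.sql" % (database, table))
    with open(path, "wb") as out:
        # mysqldump writes straight to the file, so the dump never has to
        # fit in this process's memory.
        run(build_command(database, table, user, password), stdout=out)
    return path


def dump_tables(tables, user, password, destination=".", threads=4,
                run=subprocess.check_call):
    """Dump (database, table) pairs in parallel, one file per table."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(dump_one, db, tbl, user, password,
                               destination, run)
                   for db, tbl in tables]
        # Results come back in submission order; a failed dump raises here.
        return [f.result() for f in futures]
```

The `run` parameter exists only so the process launcher can be swapped out (for example in a dry run); by default each worker calls `subprocess.check_call`.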

History

I saw an interesting post on MySQL Performance Blog with some suggestions to improve mysqldump.

Here is my attempt to implement some of those suggestions.

Download

Requirements

Usage

Simplest usage (will save a file for each table):

mysqlpdump.py -u root -p password

Save compressed files (gzip) to /tmp/dumps and pass "--skip-opt" to mysqldump:

mysqlpdump.py -u root -p password -d /tmp/dumps/ -g -P "--skip-opt"

Output to stdout and use 20 threads:

mysqlpdump.py -u root -p password --stdout -t 20

Be more verbose:

mysqlpdump.py -u root -p password -v

Exclude the "mysql" and "test" databases from the dump:

mysqlpdump.py -u root -p password -e mysql -e test

Dump only the "mysql" database:

mysqlpdump.py -u root -p password -i mysql
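The effect of the `-i` and `-e` options can be pictured with a small filter like the one below. This is only a sketch: the function name `select_databases` is hypothetical, and the rule that exclusions are applied after inclusions is an assumption, since the post does not say how the two options interact.

```python
def select_databases(all_databases, include=(), exclude=()):
    """Return the databases to dump, keeping the server's order."""
    if include:
        # An include list restricts the dump to the named databases.
        selected = [db for db in all_databases if db in include]
    else:
        selected = list(all_databases)
    # Assumption: exclusions are applied last.
    return [db for db in selected if db not in exclude]


databases = ["information_schema", "mysql", "shop", "test"]
print(select_databases(databases, exclude=["mysql", "test"]))
# prints ['information_schema', 'shop']
print(select_databases(databases, include=["mysql"]))
# prints ['mysql']
```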

Links

Changelog

  • 0.5
    • Compress 00_master_data.sql file if specified
    • bugfix: when it’s called without a terminal or a logged-in user, it now uses “nobody” as the user.
    • bugfix: destination now works with 00_master_data.sql
  • 0.4
    • Made it compatible with Python 2.4
    • Can include and exclude specified databases.
  • 0.3
    • Fixed a bug that prevented tables from being dumped because of a lock
    • Added the --master-data option to write the “CHANGE MASTER TO” statement
  • 0.2
    • Store dumps directly in files instead of writing them to stdout
    • Can compress files
    • Dump each table in its own file
    • Can pass parameters directly to mysqldump
  • 0.1
    • First version
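The 0.5 fix for running without a terminal (for example from cron, where `os.getlogin()` raises `OSError`) could look roughly like this. It is a minimal sketch, not the code shipped in mysqlpdump: the helper name `default_user` is hypothetical, and only the “nobody” fallback comes from the changelog entry above.

```python
import os


def default_user(fallback="nobody"):
    """Return the login name, or a fallback when no terminal is attached."""
    try:
        return os.getlogin()
    except OSError:
        # Raised when there is no controlling terminal, e.g. under cron.
        return fallback
```

The result could then be passed as the `default=` value of the user option instead of calling `os.getlogin()` directly.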

License

mysqlpdump is released under the GNU GPL license.

23 / 05 / 2007
Posted by fr3nd in General at 15:37

14 Comments

  1. Interesting, but I recommend that you put the license on this website. That is the usual practice, isn’t it?

    Comment by di4blo, 24/5/2007 @ 4:55 pm

  2. I suppose we need Python 2.5 installed, don’t we?

    Comment by Devis, 5/6/2007 @ 1:06 pm

  3. Devis: That’s true. Python 2.5 is necessary because I’m using some functions in the Queue module that are only available in 2.5.

    Comment by fr3nd, 5/6/2007 @ 1:30 pm

  4. Hi fr3nd, it’s a pity, because Debian will release 2.5 in 2070… So for the moment I cannot use this script; I hope I can in the future… Do you think it is possible to remove the 2.5-only calls, like queue.join?
    Anyway, thank you!

    Comment by Devis, 5/6/2007 @ 1:56 pm

  5. Devis, I’ll try… I just need to find a safe way to replace .join()

    Let me check it.

    Comment by fr3nd, 5/6/2007 @ 2:00 pm

  6. Devis: I made it compatible with Python 2.4. It was actually easier than I thought. I’ve also added two more options to specify which databases to dump and which to skip. See Usage for examples.

    Comment by fr3nd, 5/6/2007 @ 3:32 pm

  7. Wonderful, it works! 10M kudos to you!
    I will certainly follow the script’s evolution, and if I can help, just write to me :-)
    Thank you!

    Comment by Devis, 5/6/2007 @ 3:51 pm

  8. Hi fr3nd, I’ve put mysqlpdump in a cron job but I am having a small problem:

    -------
    Traceback (most recent call last):
      File "mysqlpdump.py", line 241, in ?
        main()
      File "mysqlpdump.py", line 187, in main
        parser.add_option("-u", "--user", action="store", dest="user", type="string", default=os.getlogin(), help="User for login.")
    OSError: [Errno 25] Inappropriate ioctl for device
    -------

    It is caused by “os.getlogin()” in combination with cron, as there isn’t any terminal or logged-in user.
    For the moment I’ve replaced it with ‘root’ and it works for me.

    Comment by Devis, 6/6/2007 @ 12:37 pm

  9. Devis: version 0.5 fixes that bug. Thanks for the bug report!

    Comment by fr3nd, 7/6/2007 @ 9:26 am

  10. This sounds very interesting and promising; but I think it will not be possible to create a consistent dump that way? Or how should locking be handled?

    Comment by danielj, 13/9/2007 @ 7:09 am

  11. Hi danielj, to create a consistent dump you need to lock *all* tables before the dump.

    Comment by Devis, 13/9/2007 @ 7:19 am

  12. Thank you!

    Too bad MySQL does not support several connections sharing the same transactional context, so the only way to get a consistent backup is to set low-priority updates and lock all the tables. With multiple threads, you may otherwise have no luck locking all of them.

    Comment by Peter Zaitsev, 13/9/2007 @ 9:52 am

  13. Thanks for the tool! I love it. I used it and blogged about it yesterday, and then today I wrote a restore tool for it (http://mysql-ha.com/2007/09/13/mysqlprestore-for-parallel-restores/)
    Please feel free to grab the file and incorporate it however you like. I tried to do it in such a way that the two could be merged pretty well, I think.

    Comment by Monty Taylor, 13/9/2007 @ 6:56 pm

  14. This tool automatically locks all tables before starting the dump, so there is no need to lock them manually. It’ll create a consistent dump.

    About the restore tool, I don’t think it’s needed… Restoring is as easy as doing:

    cat *.sql | mysql

    Comment by fr3nd, 14/9/2007 @ 8:24 am
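For dumps saved with the gzip option, a plain `cat *.sql` would not see the compressed files. The sketch below shows one way to stream both plain and compressed dumps in name order. It is not part of mysqlpdump, and both the function name `concatenate_dumps` and the `.sql.gz` suffix for compressed files are assumptions.

```python
import glob
import gzip
import os
import shutil


def concatenate_dumps(directory, out):
    """Write every dump file in `directory` to `out`, in sorted name order."""
    names = sorted(glob.glob(os.path.join(directory, "*.sql")) +
                   glob.glob(os.path.join(directory, "*.sql.gz")))
    for name in names:
        # Compressed dumps are decompressed on the fly.
        opener = gzip.open if name.endswith(".gz") else open
        with opener(name, "rb") as src:
            shutil.copyfileobj(src, out)
    return names
```

Calling `concatenate_dumps("/tmp/dumps", sys.stdout.buffer)` from a small script and piping its output to `mysql` would mirror the one-liner above.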
