Always have a backup! You will need it the day you find yourself working on a restore.
This blog post is related to #PGSQLPhriday. The original post by Ads can be seen here.
- Describe how you do backups for your PostgreSQL databases.
- Which tool(s) are you using, where do you store backups, and how often do you do them?
- Are there any recommendations you can give the reader to improve their backups?
- Any lesser-known features in your favorite backup tool?
- Any new and cool features in a recently released version?
- Bonus question: Is pg_dump a backup tool?
If you go back to ads’ original blog post, he asks, “Backups? Do you need that?” When I read this as the topic for #PGSQLPhriday, I thought to myself, this guy is crazy; he must have had too much ice cream if he believes we don’t need backups! But then I realized he was just using that as a headline to get me reading more. Thank you, ads, for making my heart skip a beat. I owe you a good ice cream at the next event for that one.
Let’s Talk Backups
I’m still fairly new to Postgres, having only started about five years ago. I began with PG 9.6, and we quickly moved that environment to PG 10. When I arrived at this company, pg_dump was the only backup we were using. The DB at that time was still around 50 GB, so it was reasonable to do the backups in a timeframe that worked for us. A dump was taken every night and stored off to another server. After some time we started to test pg_basebackup, which let us take a full server backup each night. It was a huge improvement in both speed and the ability to handle much larger data sizes.
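A nightly pg_basebackup run like the one described above might look something like this sketch; the host, user, and paths are illustrative assumptions, not our actual setup.

```shell
#!/bin/sh
# Nightly full-cluster backup sketch. Host, user, and paths are
# placeholders, not the environment described in this post.
BACKUP_ROOT=/var/backups/pg
STAMP=$(date +%Y-%m-%d)              # one dated backup directory per night
TARGET="$BACKUP_ROOT/base_$STAMP"

mkdir -p "$TARGET"

# -Ft -z writes compressed tar files; -X stream includes the WAL
# needed to make the copy consistent; -P reports progress.
pg_basebackup -h db1.example.com -U replication \
  -D "$TARGET" -Ft -z -X stream -P
```

The dated directory name makes it easy to ship each night's backup off to another server, as we did.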
The downside to this approach is that you have to restore the full server directory to another server or over the top of your existing one. In the world of the DBA, one of the more frequent tasks is restoring just a table that got dropped by a DBA, engineer, sysadmin, or sometimes even the CTO. When you simply need to restore one table, a base backup makes that pretty hard, since you HAVE to restore the entire server/directory to another location.
Now don’t get me wrong, it’s not easy to do that with pg_dump either, but with a dump you can at least point the restore at the same server under a new DB name. That means it’s usually a lot faster to restore just that table. A dump also gives you a good chance of getting the data back quickly: if you already have the structure of the table, you can restore only the data to the original database and not wait for things like indexes or keys to be rebuilt from the dump.
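For this kind of single-table recovery, the dump needs to be in a format pg_restore can read selectively. A sketch, assuming custom-format (`-Fc`) dumps; the database and table names (appdb, orders) are made up for the example:

```shell
#!/bin/sh
# Illustrative single-table recovery. Database and table names are
# invented for the example; adjust to your environment.
DUMP="/var/backups/pg/appdb_$(date +%Y%m%d).dump"

# Take the nightly dump in custom format so pg_restore can later pull
# individual objects out of it.
pg_dump -Fc -f "$DUMP" appdb

# Option 1: someone dropped the table. Restore just that table into a
# scratch database on the same server, then copy the rows back.
createdb appdb_restore
pg_restore -d appdb_restore -t orders "$DUMP"

# Option 2: the table structure survived and only the rows are gone.
# Load data only and skip rebuilding indexes and keys from the archive.
pg_restore -d appdb --data-only -t orders "$DUMP"
```

Option 2 is the fast path mentioned above: with the table already in place, pg_restore only has to replay the rows.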
Regardless of the tools or the database (SQL Server, PG, MySQL, ES), here are some key items I always make sure of in any of my database situations:
- Test all backups! You have NOT backed up if you don’t test the backup by restoring it. This doesn’t have to be every file, but you should have an audit restore running every X days/weeks that tests a backup file for you. Just do it!
- Make sure others can back up and restore, and write down the backup and restore procedure. Even if you have a seasoned team of DBAs, you never know when the whole group will be unavailable. Sometimes the person who knows the least about restoring data needs to be the one who restores it. We always maintain documentation on how to restore the DB and make sure several teams in the company have both the permissions and the understanding to do it.
- Know how you are planning to restore the DB. 99.99999% of the time you are not going to restore over the top of the DB/server you are already using. Most of the time you will restore to a new server, a new system, or a different DB. Think about this when you design your backup and restore process. A lot of tools talk about how quickly they can put the whole drive back in place from just a few hours ago, but rarely does a customer want everything back to a few hours ago. Typically only one part of the database has been impacted, and you don’t want to roll back all the other pieces of the system. When designing your strategy, think about what a restore looks like and what you will typically use it for.
- Don’t just back up the DATA! Many database systems have configurations and settings that are key to making the database work properly. Perhaps you have a custom configuration DB in your system that the application needs in order to work. In PG we would back up the conf files as well with our backup scripts, which allowed us to rebuild the server if needed. You don’t need to back them up as frequently as the data files, but they should still get backed up.
- Stop backing up only to the server! Yes, I still see this at customer locations. It’s perfectly fine to write the backup to a local disk and THEN synchronize it to the cloud, S3, or a tape drive (they still exist). Just make sure you are actually doing that step and have alerts if it fails. And going back to #1 in this post, you should run your test restore from your cloud backups, not off the local disk. If you only have backups on a local server and a local drive, even in a data center, do yourself a favor: get an S3 bucket and synchronize the backups to it. Recently a customer of mine could not retrieve their data from the data center because the data center got hit with ransomware; the operator locked everything down and wouldn’t let anything out. If we had a backup in S3 or elsewhere in the cloud, we would have had a copy and could have kept moving forward and rebuilding. This almost put the company out of business, but luckily we found a backup outside the data center that was a few months old.
I have so many more stories I could tell about backups and restores! I will keep this post relatively short and dedicate a future post just to these items. I hope this list helps you build your backup strategy. I would love to hear about issues you have had with backups and restores!