By end of this month, I am completing 5 years with OmniTI. Before joining OmniTI, I was an Oracle DBA and worked mostly with closed source databases. I am grateful & fortunate that Theo and Robert provided me opportunity and guidance to work and contribute with open source technologies and most importantly an open source database PostgreSQL ! At OmniTI, if you don't have hesitation to learn new technlogy, you will have smooth learning experience alongside working with your known technology for the clients. Most importantly, you can have access to technology leaders in the office. Ironically, I never worked for single company longer than a year before joining OmniTI in my career but now I am looking forward to learn and contribute to FOSS community working with OmniTI.
Since joining, I haven't spent a day without learning a new thing . Are you bored at routine work? Come join OmniTI's adventure, here I work!
Bravo to OmniTI & PosgreSQL and FOSS community!!
Denish Patel
Tuesday, February 14, 2012
Thursday, January 5, 2012
What is pg_extractor ?
In my recent blog post, I wrote about PostgreSQL DBA Handyman toolset. In the list of tools, getddl is one of them. If you are using getddl to get DDL schema and track the daily changes in SVN for production databases, you should consider moving that process to use pg_extractor instead. pg_extractor is the more advance and robust tool for extracting schema as well data using pg_dump. Keith Fiske, an author of the tool, described tool in detail in his blog post. Thanks to Keith for making the schema extraction tool more robust and taking it to next level !
Hopefully, it will help you to have more control over your database in smarter way!
Hopefully, it will help you to have more control over your database in smarter way!
Tuesday, December 6, 2011
Faster & Better VACUUM FULL
In presentation, I discussed in detail about Bloat issue in PostgreSQL and methods to remove Bloat from the tables/indexes. Now a days, PostgreSQL9.0 is common and the widely used version for the production use and it's vital to remind about changes in most important bloat removal tool called "VACUUM FULL". Until PostgreSQL 9.0, VACUUM FULL was tardy and DBA always stayed away from it and used CLUSTER instead. (Checkout presentation for difference between CLUSTER vs VACUUM FULL)
The VACUUM FULL statement recovers free space from a table to reduce its size from bloated tables, mostly when VACUUM itself hasn't been run frequently enough. Before PostgreSQL 9.0 , it was tardy and slow because of the way it was executed: records were read and moved one by one from their source block to a block closer to the beginning of the table. Once the end of the table was emptied, this empty part was removed. This method was very inefficient: moving records one by one creates a lot of random IO. Additionally, during this reorganization, indexes had to be maintained, making everything even more costly, and fragmenting indexes. It was therefore advised to reindex a table just after a VACUUM FULL.
Now, the VACUUM FULL statement, as of PostgreSQL 9.0, creates a new table from the current one, copying all the records sequentially. Once all records are copied, index are created back, and the old table is destroyed and replaced. This has the advantage of being much faster. VACUUM FULL still needs an EXCLUSIVE LOCK during entire operation. The only drawback of this method compared to the old one, is that VACUUM FULL can use as much as two times the size of the table and indexes on disk, as it is creating a new versions of it.
Let's compare run-time of VACUUM FULL on PostgreSQL 8.4 vs PostgreSQL 9.0
postgres=# create table vacuumtest(id int primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "vacuumtest_pkey" for table "vacuumtest"
CREATE TABLE
postgres=# insert into vacuumtest select generate_series(1,10000000);
INSERT 0 10000000
postgres=# delete from vacuumtest where id%4=0;
DELETE 2500000
postgres=# vacuum vacuumtest;
VACUUM
On 8.4:
postgres=# vacuum full vacuumtest ;
VACUUM
Time: 61418.197 ms
postgres=# reindex table vacuumtest;
REINDEX
Time: 12212.815 ms
On 9.0:
postgres=# vacuum full vacuumtest ;
VACUUM
Time: 32640.714 ms
Above results show that VACCUM FULL on PostgreSQL 9.0 is way faster than previous versions. Moreover, VACUUM FULL has couple of advantages over CLUSTER : it's faster than CLUSTER because it doesn't have to build new table using ORDER by clause & you can run VACUUM FULL on tables on which there isn't any index.
If you are running any bloat removal tool on the production database, i would recommend you to revisit vacuum parameters and tighten them up a little that makes regular vacuum to run more frequent so it will help to reduce frequency of running more intrusive bloat removal tools!!
The VACUUM FULL statement recovers free space from a table to reduce its size from bloated tables, mostly when VACUUM itself hasn't been run frequently enough. Before PostgreSQL 9.0 , it was tardy and slow because of the way it was executed: records were read and moved one by one from their source block to a block closer to the beginning of the table. Once the end of the table was emptied, this empty part was removed. This method was very inefficient: moving records one by one creates a lot of random IO. Additionally, during this reorganization, indexes had to be maintained, making everything even more costly, and fragmenting indexes. It was therefore advised to reindex a table just after a VACUUM FULL.
Now, the VACUUM FULL statement, as of PostgreSQL 9.0, creates a new table from the current one, copying all the records sequentially. Once all records are copied, index are created back, and the old table is destroyed and replaced. This has the advantage of being much faster. VACUUM FULL still needs an EXCLUSIVE LOCK during entire operation. The only drawback of this method compared to the old one, is that VACUUM FULL can use as much as two times the size of the table and indexes on disk, as it is creating a new versions of it.
Let's compare run-time of VACUUM FULL on PostgreSQL 8.4 vs PostgreSQL 9.0
postgres=# create table vacuumtest(id int primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "vacuumtest_pkey" for table "vacuumtest"
CREATE TABLE
postgres=# insert into vacuumtest select generate_series(1,10000000);
INSERT 0 10000000
postgres=# delete from vacuumtest where id%4=0;
DELETE 2500000
postgres=# vacuum vacuumtest;
VACUUM
On 8.4:
postgres=# vacuum full vacuumtest ;
VACUUM
Time: 61418.197 ms
postgres=# reindex table vacuumtest;
REINDEX
Time: 12212.815 ms
On 9.0:
postgres=# vacuum full vacuumtest ;
VACUUM
Time: 32640.714 ms
Above results show that VACCUM FULL on PostgreSQL 9.0 is way faster than previous versions. Moreover, VACUUM FULL has couple of advantages over CLUSTER : it's faster than CLUSTER because it doesn't have to build new table using ORDER by clause & you can run VACUUM FULL on tables on which there isn't any index.
If you are running any bloat removal tool on the production database, i would recommend you to revisit vacuum parameters and tighten them up a little that makes regular vacuum to run more frequent so it will help to reduce frequency of running more intrusive bloat removal tools!!
Friday, November 11, 2011
PostgreSQL Handyman Toolset
A PostgreSQL handyman is a person skilled at a wide range of maintenance and repairs, typically around the PostgreSQL database system. If you are working with PostgreSQL and you have right tools for the job, you can easily accomplish that job efficiently and most importantly on time. In real world, if you want to dig into land and you don't have tools then you are failed at job before you start the job! Having tools for any job is most important factor for job success. While working at OmniTI, I tested and used variety of tools to make my work life easy and efficient.
PostgreSQL community is thriving by people, those helps to build new tools, give it back to community for re-use and providing helping hands to test the tools. Here are some of my suggested list of tools which could help to finish your job efficiently and on time!!
1. Tasty Treats for PostgreSQL project includes plenty of PostgreSQL management tools , which is one of the projects of OmniTI's Lab repository.
Some of the tools includes:
2. OMNIpitr Advanced WAL File Management Tools for PostgreSQL. This tool helps to handle HA processes efficiently to manage wal shipping, restore & removal , online backup on slave and/or master.
3. bucardo.org also includes mission critial tools:
4. repmgr: it allows you to monitor and manage high availability part of PostgreSQL databases (9.0+) installation.
5. pg_reorg: it allows you to rebuild Bloated tables online. Review these slides to understand how it works.
6. RubyRep : Components of this tool set can help you compare and sync two tables or even two databases.
I would love to know tools used by you for managing PostgreSQL database server which are missing from my PostgreSQL handyman too-lset !!
Hopefully, these tools will make your next task easy!! If you find bug, don't hesitate to report to send out email to relevant tool maintainer with error or suggestion or patch with bug fix.
DISCLAIMER: use any of listed tool on production at your own risk.
PostgreSQL community is thriving by people, those helps to build new tools, give it back to community for re-use and providing helping hands to test the tools. Here are some of my suggested list of tools which could help to finish your job efficiently and on time!!
1. Tasty Treats for PostgreSQL project includes plenty of PostgreSQL management tools , which is one of the projects of OmniTI's Lab repository.
Some of the tools includes:
- zbackup
- getddl
- pgsniff
- table growth report
- quickstats
- pg_log monitor
- bloat reports
- pgtruss
- system monitoring
- compact table
- blocked queries
- Curo
2. OMNIpitr Advanced WAL File Management Tools for PostgreSQL. This tool helps to handle HA processes efficiently to manage wal shipping, restore & removal , online backup on slave and/or master.
3. bucardo.org also includes mission critial tools:
- Bucardo
- Asynchronous PostgreSQL Replication System
- check_postgres
- Nagios-Friendly PostgreSQL Database Checking Script
- pgsi
- Wiki-Ready Query Reports
- tail_n_mail
- Monitor log files
- boxinfo
- Gather system information
4. repmgr: it allows you to monitor and manage high availability part of PostgreSQL databases (9.0+) installation.
5. pg_reorg: it allows you to rebuild Bloated tables online. Review these slides to understand how it works.
6. RubyRep : Components of this tool set can help you compare and sync two tables or even two databases.
I would love to know tools used by you for managing PostgreSQL database server which are missing from my PostgreSQL handyman too-lset !!
Hopefully, these tools will make your next task easy!! If you find bug, don't hesitate to report to send out email to relevant tool maintainer with error or suggestion or patch with bug fix.
DISCLAIMER: use any of listed tool on production at your own risk.
Wednesday, September 14, 2011
The Scalability & Performance Conference : Surge 2011
The Scalability and Performance covers most of the web scale issues, innovations and evolutions in technology. As per my knowledge, Surge is the only conference that gives the coverage to the issues and solutions concerning Scalability & Performance without any bias on technology that we often notice at most other conferences. For example , if you are looking from Database perspective, it covers topic ranges from RDBMS to NoSQL and talks about problems solved by the technology and new possible problems introduced by new technology.
I attended Surge 2010 and listed attended sessions on my last year post. I am going to attend Surge again. Surge 2010 was the best conference attendance experience for me and others. By looking at Sponsor list (Google is one of them!), this year (Sept 28-30 , 2011) conference is going to be far bigger and better than last year and don't forget to checkout industry leader Speakers list & Hack Day/Training Schedule !! If you haven't registered , you still have chance to do so.
Surge 2011 will be surge of the knowledge ! See you soon !
I attended Surge 2010 and listed attended sessions on my last year post. I am going to attend Surge again. Surge 2010 was the best conference attendance experience for me and others. By looking at Sponsor list (Google is one of them!), this year (Sept 28-30 , 2011) conference is going to be far bigger and better than last year and don't forget to checkout industry leader Speakers list & Hack Day/Training Schedule !! If you haven't registered , you still have chance to do so.
Surge 2011 will be surge of the knowledge ! See you soon !
Thursday, May 5, 2011
When was my database started ?
In the database world, it's fair to ask question to database that when have you started? In Oracle world, it can be easily find out by querying system level view V$INSTANCE. In PostgreSQL world, until today, I wasn't able to answer that question without checking database log files. Today, I asked this question to OmniTi's database operations irc channel and got answer in 20 minutes. I regret that I should have asked it earlier but anyways, now I know that in PostgreSQL it's possible using this query.
omniti=# SELECT (pg_stat_file( 'postmaster.pid')).modification;
modification
------------------------
2010-08-13 15:37:14-04
If someone complains that database was restarted , it would be easy as DBA to query and verify it. I hope this query will be helpful to you during day-to-day operations!
Friday, March 25, 2011
P90X your database!! presentation slides
I just gave talk on "P9ox your database!!" .. here are the slides..
Subscribe to:
Posts (Atom)