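I have a Rails app (4.1.5) backed by a Postgres database (9.2.13), using the pg gem (0.17.1) and a unicorn server (1.1.0) with 2 worker processes.
The Rails app uses Sidekiq (2.17.7) to run jobs.
At some point, the Postgres database went into recovery mode. Multiple jobs threw the following error: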
PG::ConnectionBad:FATAL: the database system is in recovery mode FATAL: the database system is in recovery mode
The database recovered, but the jobs kept raising the following two errors:
PG::Error: invalid encoding name: utf8
    /home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/postgresql_adapter.rb:908:in `set_client_encoding'
    /home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/postgresql_adapter.rb:908:in `configure_connection'
    /home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/postgresql_adapter.rb:603:in `reconnect!'
    /home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract_adapter.rb:313:in `verify!'
    /home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:453:in `block in checkout_and_verify'
    /home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activesupport-4.1.5/lib/active_support/callbacks.rb:82:in `run_callbacks'
and
ActiveRecord::ConnectionTimeoutError: could not obtain a database connection within 5.000 seconds (waited 5.000 seconds)
    /home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:190:in `block in wait_poll'
    /home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:181:in `loop'
    /home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:181:in `wait_poll'
    /home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:136:in `block in poll'
    /home/ruby/.rvm/rubies/ruby-2.2.2/lib/ruby/2.2.0/monitor.rb:211:in `mon_synchronize'
    /home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:146:in `synchronize'
    /home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:134:in `poll'
It looks like Rails notices that the connection is not active and tries to reset it. In Active Record's postgresql_adapter.rb, the following method is called:
    # Close then reopen the connection.
    def reconnect!
      super
      @connection.reset
      configure_connection
    end
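My guess is that connection.reset does not actually work, so when the pg gem sets the encoding (the first step of the configure_connection method), it masks the fact that there is no connection by throwing an encoding-specific error.
Here is the method from the pg gem (0.17.1), at ext/pg_connection.c:2804: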
    /*
     * call-seq:
     *    conn.set_client_encoding( encoding )
     *
     * Sets the client encoding to the _encoding_ String.
     */
    static VALUE
    pgconn_set_client_encoding(VALUE self, VALUE str)
    {
        PGconn *conn = pg_get_pgconn( self );

        Check_Type(str, T_STRING);

        if ( (PQsetClientEncoding(conn, StringValuePtr(str))) == -1 ) {
            rb_raise(rb_ePGerror, "invalid encoding name: %s",StringValuePtr(str));
        }

        return Qnil;
    }
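To make the guess concrete, here is a minimal sketch of the masking I suspect. It does not use the pg gem at all: FakeConnection, reconnect_without_check, reconnect_with_check and error_message are hypothetical names I made up, and the silently failing reset is an assumption, not something I have confirmed in libpq or the gem. The only part taken from the source above is that every failure of the encoding call is reported as "invalid encoding name".

```ruby
# Hypothetical stand-in for PG::Connection; nothing here comes from the pg gem.
class FakeConnection
  def initialize
    @alive = false # assumption: the server is still unreachable
  end

  # Models a reset that fails silently and leaves the connection dead.
  def reset
    @alive = false
    nil
  end

  def alive?
    @alive
  end

  # Mirrors the C function above: any failure becomes "invalid encoding name".
  def set_client_encoding(name)
    raise "invalid encoding name: #{name}" unless @alive
    nil
  end
end

# Same call order as reconnect!: reset, then configure (encoding first).
def reconnect_without_check(conn)
  conn.reset
  conn.set_client_encoding("utf8")
end

# Variant that checks the connection state before configuring it.
def reconnect_with_check(conn)
  conn.reset
  raise "connection is still down after reset" unless conn.alive?
  conn.set_client_encoding("utf8")
end

# Runs the block and returns the error message, or nil if nothing was raised.
def error_message
  yield
  nil
rescue => e
  e.message
end

puts error_message { reconnect_without_check(FakeConnection.new) }
puts error_message { reconnect_with_check(FakeConnection.new) }
```

In this model the first call reports the encoding error even though the real problem is the dead connection, which matches the first backtrace above. The second variant surfaces the actual cause.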
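So, if these guesses are correct, why is the connection reset not working?
Restarting the app re-establishes the connection to the database, but I would like a solution to this problem that does not require manual intervention.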
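One non-manual approach I am considering, untested against the real setup, is to wrap the job body in a retry helper that runs a reset hook before trying again. The sketch below is plain Ruby with no Rails or pg dependency. with_connection_retry and FakeConnectionError are hypothetical names, and the failing block only simulates a database that comes back on the third attempt.

```ruby
# Hypothetical helper: run the block; on one of the listed errors, call the
# caller-supplied reset hook and try again, up to max_attempts in total.
def with_connection_retry(reset:, retry_on: [StandardError], max_attempts: 3, wait: 0)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue *retry_on
    raise if attempts >= max_attempts # give up and re-raise the last error
    reset.call
    sleep(wait) if wait > 0
    retry
  end
end

# Stand-in for a connection error; not a class from pg or Active Record.
class FakeConnectionError < StandardError; end

resets = 0
calls = 0
result = with_connection_retry(reset: -> { resets += 1 }, retry_on: [FakeConnectionError]) do
  calls += 1
  raise FakeConnectionError, "the database system is in recovery mode" if calls < 3
  :ok
end

puts "result=#{result} calls=#{calls} resets=#{resets}"
```

In the app itself, the reset hook could presumably be something like `-> { ActiveRecord::Base.clear_all_connections! }`, with retry_on listing the error classes from the backtraces above, but I have not verified that this clears the stuck connections in this situation.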