
Source failure recovery after losing connection

When the OpenEdge Replication server loses its connection to one or more OpenEdge Replication agents, it tries to re-establish the connection for the period set by the connect-timeout value in the OpenEdge Replication server properties file.
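The timed reconnect attempt described above can be pictured with a minimal Python sketch. It is not OpenEdge code: `try_connect`, `retry_interval`, and the injectable `clock` and `sleep` parameters are hypothetical names introduced for illustration, and the sketch assumes the connect-timeout value is expressed in seconds.

```python
import time
from typing import Callable


def reconnect_within_timeout(
    try_connect: Callable[[], bool],
    connect_timeout: float,
    retry_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry try_connect() until it succeeds or connect_timeout elapses.

    All names here are illustrative assumptions, not OpenEdge APIs.
    """
    deadline = clock() + connect_timeout
    while True:
        if try_connect():
            return True   # connection re-established within the window
        if clock() >= deadline:
            return False  # window expired without a connection
        # Wait before the next attempt, but never sleep past the deadline.
        sleep(min(retry_interval, max(0.0, deadline - clock())))
```

Passing the clock and sleep functions as parameters keeps the sketch testable without real waiting; the retry interval itself is an assumption, since the text only states the overall timeout.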
The OpenEdge Replication server does the following:
1. The OpenEdge Replication server recognizes that there has been an agent failure. The server places itself into a state that allows continuous database activity, as if OpenEdge Replication were not running.
2. The OpenEdge Replication server tries to reconnect to OpenEdge Replication agents for a set amount of time.
Source database activity by clients is still allowed unless synchronous replication is being used or schema updates are being performed by a process.
3. If the OpenEdge Replication server is able to reconnect to the OpenEdge Replication agent, it resumes processing AI blocks from the database. When it gets within ten AI blocks of the last AI block written, the OpenEdge Replication server temporarily stalls normal database activity and completes the synchronization process.
Schema updates are not allowed while the OpenEdge Replication server is performing synchronization. If schema updates are being performed when failure recovery synchronization begins, source database updates will block until failure recovery completes.
When synchronous replication is being used, source database activity cannot continue without a connection to the agent.
4. When synchronization is completed, the OpenEdge Replication server reinserts itself into the AI block write process. The database is then unstalled, allowing normal database activity and replication activity to continue.
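The catch-up, stall, and unstall sequence in steps 3 and 4 can be sketched as a toy simulation. This is a simplified model under stated assumptions, not the product's implementation: the class `RecoverySim`, the notion of a "tick", and the per-tick rates are hypothetical, and "within ten AI blocks" is interpreted here as a gap of ten blocks or fewer.

```python
from dataclasses import dataclass, field

SYNC_THRESHOLD = 10  # "within ten AI blocks" (interpreted as gap <= 10)


@dataclass
class RecoverySim:
    last_written: int    # number of the last AI block written by the source
    last_processed: int  # number of the last AI block the server processed
    stalled: bool = False
    log: list = field(default_factory=list)

    def recover(self, writes_per_tick: int, processed_per_tick: int,
                max_ticks: int = 1000) -> bool:
        """Return True if synchronization completes within max_ticks."""
        self.log.append("reconnected")
        for _ in range(max_ticks):
            if self.last_written - self.last_processed <= SYNC_THRESHOLD:
                # Close enough: stall database activity and finish syncing.
                self.stalled = True
                self.log.append("stalled")
                self.last_processed = self.last_written
                self.log.append("synchronized")
                # Server rejoins the AI write path; activity resumes.
                self.stalled = False
                self.log.append("unstalled")
                return True
            # Source keeps writing while the server works through the backlog.
            self.last_written += writes_per_tick
            self.last_processed = min(
                self.last_written, self.last_processed + processed_per_tick
            )
        return False  # backlog never shrank to the threshold
```

In this toy model, a server that processes blocks faster than the source writes them reaches the threshold and synchronizes, while one that processes more slowly never does and `recover` returns `False`.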
If the OpenEdge Replication server cannot reconnect to all agents, or to the critical agent, within the configured connect-timeout period, the server terminates and source database activity continues. In other words, if no critical agent is specified, the server must reconnect to all agents or it terminates; if one agent is specified as the critical agent, the server continues as long as it can reconnect to that agent. While source database activity continues without the OpenEdge Replication server running, make sure there is enough AI extent space to hold all database activity until the server is restarted and replication resumes.
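The terminate-or-continue rule above reduces to a small decision function. The sketch below is only an illustration of that rule; the function name, its parameters, and the agent names in the usage example are hypothetical and do not correspond to any OpenEdge API or property.

```python
from typing import Iterable, Optional


def server_keeps_running(
    configured_agents: Iterable[str],
    reconnected_agents: Iterable[str],
    critical_agent: Optional[str] = None,
) -> bool:
    """Decide whether the server continues once connect-timeout expires.

    Illustrative only: models the rule stated in the text.
    """
    reconnected = set(reconnected_agents)
    if critical_agent is not None:
        # A critical agent is specified: only that agent must be reachable.
        return critical_agent in reconnected
    # No critical agent: every configured agent must have reconnected.
    return set(configured_agents) <= reconnected
```

For example, with agents `agent1` and `agent2` configured and only `agent1` reconnected, the function returns `True` when `agent1` is the critical agent and `False` when no critical agent is specified.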
While failure recovery is being performed and synchronization takes place, the OpenEdge Replication server might not catch up to the database. During this time, none of the target databases are up to date with the source.