
Restore Publisher from Subscriber

cistera.com
Level 1

Hi,

 

I had a lab publisher die on me and, of course, I did not have any backups.  I have read the article on restoring a publisher from a subscriber, but the DRF Local service on the subscriber will not let me add a backup location (it is greyed out in the GUI, and the command line eventually times out):

 

admin:utils disaster_recovery device add network NAS /Backup 192.168.5.100 root
Unable to connect to Master Agent host: cucm-lab-pub, Port: 4040. This may be due to Master or Local Agent being down.
Unable to connect to Master Agent host: cucm-lab-pub, Port: 4040. This may be due to Master or Local Agent being down.
Please enter password to connect to network server 192.168.5.100:*********
Unable to connect to Master Agent host: cucm-lab-pub, Port: 4040. This may be due to Master or Local Agent being down.
drfCliMsg: Unable to save Backup Device successfully. Local Agent is not responding. This may be due to Master or Local Agent being down.

 

Anyone know a way around this?  I have restarted the Local DRF service many times.  Nothing in the drf traces except the same error.
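
The error above names the host and port the CLI cannot reach (cucm-lab-pub, port 4040). A plain TCP connect test from another machine on the same network is one quick way to confirm whether anything is listening there. The following is a minimal sketch in Python, assuming the machine running it resolves the same host name; `can_connect` is a hypothetical helper, not part of any Cisco tool, and a successful connect only shows the port is open, not that the Master Agent is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call, using the host and port from the error message:
# can_connect("cucm-lab-pub", 4040)
```

If this returns False for the publisher's name, the subscriber's Local Agent has nothing to talk to, which is consistent with the publisher being down.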

 

Thanks in advance

 

-Greg

8 Replies

Jaime Valencia
Cisco Employee

http://www.cisco.com/c/en/us/td/docs/voice_ip_comm/cucm/drs/8_6_1/drsag861.html

HTH

java

if this helps, please rate

Thanks for the info.  That is indeed the same doc I was following when I started this hunt.  My issue seems to be that the Local DRF service is not responding to anything.

 

admin:utils disaster_recovery device list
Unable to connect to Master Agent host: cucm-lab-pub, Port: 4040. This may be due to Master or Local Agent being down.
Unable to connect to Master Agent host: cucm-lab-pub, Port: 4040. This may be due to Master or Local Agent being down.
Device Name        Device Type        Device Path
--------------------------------------------------------------
drfCliMsg: error occurred:Local Agent is not responding. This may be due to Master or Local Agent being down.

Executed command unsuccessfully

admin:utils service restart Cisco DRF Local
 Don't press Ctrl-c while the service is getting RESTARTED.If Service has not Restarted Properly, execute the same Command Again
Service Manager is running
Cisco DRF Local[STOPPING]
Cisco DRF Local[STOPPING]
Commanded Out of Service
Cisco DRF Local[NOTRUNNING]
Service Manager is running
Cisco DRF Local[STARTED]

And when I query again - same result.

 

When I try to add a backup device:

 

admin:utils disaster_recovery device add network NAS-5100 /Backup 192.168.5.100 root

Unable to connect to Master Agent host: cucm-lab-pub, Port: 4040. This may be due to Master or Local Agent being down.
Unable to connect to Master Agent host: cucm-lab-pub, Port: 4040. This may be due to Master or Local Agent being down.
Please enter password to connect to network server 192.168.5.100:*********
Unable to connect to Master Agent host: cucm-lab-pub, Port: 4040. This may be due to Master or Local Agent being down.
drfCliMsg: Unable to save Backup Device successfully. Local Agent is not responding. This may be due to Master or Local Agent being down.

Yet the Local DRF service is still running.  In case they help, I am attaching logs from the server's drf trace.
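
Every failing command in the transcripts above reports the same target. To pull that target out of a longer capture or trace, a small parser along these lines may help. This is only a sketch: the pattern is written against the error text exactly as it appears in this thread, and `master_agent_targets` is a hypothetical name.

```python
import re

# Matches the "Unable to connect" line as it appears in the CLI output above.
MASTER_AGENT_ERROR = re.compile(
    r"Unable to connect to Master Agent host: (?P<host>[^,\s]+), Port: (?P<port>\d+)"
)


def master_agent_targets(cli_output: str) -> set:
    """Collect the distinct (host, port) pairs the CLI failed to reach."""
    return {
        (m.group("host"), int(m.group("port")))
        for m in MASTER_AGENT_ERROR.finditer(cli_output)
    }
```

Run against the output pasted above, it yields a single pair, the old publisher's name and port 4040, which suggests the subscriber's Local Agent is still pointed at the dead publisher.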

 

I am stumped.

I'll tell you what's worked for me on two occasions with an 8.6 deployment after the pub died. 

Install the publisher, then create a backup of it using DRS. At this point the subscriber does not exist as far as the pub is concerned. Next, add the subscriber to the publisher from the relevant menu, and then perform a restore on the publisher.

It'll ask you for a backup; use the dummy backup you created after the install. Then select the option to restore from a subscriber and choose the subscriber. Once that finishes and you restart the cluster nodes one at a time, you should have the cluster as it was according to the subscriber's database (which should reflect the publisher's state up to the point when it failed).

 

 

Hmmm, that does indeed sound like a good plan.  Can you tell me what state replication needs to be in before I hook up the subscriber (with good data) to the publisher (with no data), so I do not wipe out my subscriber by accident (as I cannot back it up)?

 

Thanks!

The publisher won't wipe out the subscriber since they aren't in a cluster. You just need to add the subscriber without restarting the cluster. The restore puts them in a cluster, and so only after the DRS is complete and you restart both nodes does the replication from publisher to subscriber begin.

By the way, I've seen a case where replication didn't even work correctly no matter what kind of force sync we tried after reinstalling the pub. I hope your attempt works out since what might happen is that you'll get a pub with the data from the sub, but you're not able to restore the replication. If that's the case the only option we had was to reinstall the sub after doing a full DRS on both of them, and then the sub could sync with the pub. Hopefully it won't come to that.

Sounds good.  I have already migrated the lab to 10.5, so from this point this is just an exercise to learn something.  Thanks for the info - I just did not know whether adding the subscriber to the publisher would overwrite the subscriber's data with the blank publisher's data.   I'll try it first with replication stopped on the sub, and if that does not work, I will try it with replication enabled and see what happens.  I'll snapshot them both beforehand so I can restore and find the right mix if I screw it up :)

 

I'll post back with the exact steps so others in the future may find it in the archives.

Problem

When you navigate to the backup page, the error message "Local Agent is not responding. This may be due to Master or Local Agent being down" appears. The same error appears when you attempt to add a backup device.

Solution

Complete these steps in order to resolve this issue:

Log in to the CUCM OS Admin page.

Choose Security > Certificate Management.

Check the serial number of the ipsec.pem file.

Ensure that it matches the serial number of the ipsec-trust.pem file on the subscribers.

Restart the Cisco DRF Master and DRF Local services on the publisher.

Activate the TFTP service.
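
Comparing the two serial numbers by eye is error-prone when they are shown with different separators or letter case. The sketch below normalizes two serial strings copied by hand from the Certificate Management page and compares them. The input formats it accepts (optional `0x` prefix; colon, space, or hyphen separators) are assumptions, since the exact display format is not stated here, and `normalize_serial` and `serials_match` are hypothetical helpers.

```python
def normalize_serial(serial: str) -> str:
    """Reduce a hex serial number to lowercase digits with no separators or leading zeros."""
    text = serial.strip().lower()
    if text.startswith("0x"):
        text = text[2:]
    # Assumed separators; adjust if the page shows serials differently.
    for sep in (":", " ", "-"):
        text = text.replace(sep, "")
    if not text or any(ch not in "0123456789abcdef" for ch in text):
        raise ValueError(f"not a hex serial number: {serial!r}")
    return text.lstrip("0") or "0"


def serials_match(a: str, b: str) -> bool:
    """Return True if both strings denote the same serial number."""
    return normalize_serial(a) == normalize_serial(b)
```

For example, `serials_match("1A:2B:3C", "0x1a2b3c")` treats the two made-up values as the same serial, while anything that is not pure hex raises a `ValueError` instead of being silently compared.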

 

http://www.cisco.com/c/en/us/support/docs/voice-unified-communications/unified-communications-manager-version-71/111796-cucm-drf.html

Hi,

 

Thanks for the reply.  I have no communication with the publisher (in fact I have already rebuilt it from scratch like the doc told me), so I can only check the cert on the subs.  From what I can see from the drf traces, there are no SSL errors though.

 

Basically, I need to be able to add a backup device from the GUI or CLI on a subscriber.  The pub is dead and not coming back :)  Then I need to run a backup on the sub and restore it to the newly built pub.  The docs on DRS say it is possible, but I think they all assume the backup device is already provisioned in the database.  I am probably hitting a database table that is read-only on subscribers - not sure.

 

EDIT: And just for grins, I recreated the cert on the sub and uploaded it back to itself as trusted.  Same results.