Last week we got a fresh database server at a customer site to put the SCOM and Orchestrator databases on. Actually two servers in ALways-on setup. Should be a lot better and faster than the “temporary” servers we had been using before.
So, last week we started with the Orchestrator database simply because it is smaller and because there are less people looking in that console.
We had a few lessons learned and one of them is listed as the subject of this post. But I will go through it in a few steps. Don’t try to follow the steps exactly as described, because I am listing them in the order it happened… So lessons learned are in between!
First of all I used the following page on TechNet:
How to Change the Orchestrator Database
http://technet.microsoft.com/en-us/library/hh473578.aspx
This page talks about what to do when moving the database and it looked really easy.
So made a backup of the orchestrator database and restored it on one of the two new nodes. Looked nice. So next was adding it to the always on availability group. There is a wizard for that so clicked through that thing. Needed to specify a share that both servers could reach.
Failed.
After investigation it turns out the service account which SQL db engine is running should have access to the share from both machines, not my account. Lesson learned.
Run through the wizard again to put it in availability group. What is does is to create backups of database and transaction logs and moving it to the second server and restoring them into there and getting it in sync. It will place the database in the availability group.
Failed.
How is this possible? the backups seemed to have worked. However the second database was in a retoring mode somehow. It looked like the synchronization did not start. We had our DBA check this with us and we found that it should be connection related. So we checked the firewall settings and it turned out that the Endpoint ports used for syncronization in the cluster were not listed here. We added the firewall ports and checked the one for SQL 1433 was there as well. Lesson learned.
We turned back and removed the second copy and the entry in the Availability group and we tried it again. This time it worked.
By the way if you do it manually to copy the database to be made highly available to the secondary server, do not forget to use both the full backups of the database and ALSO create a transaction log backup and roll that one into the second one too. Else it will not work.
Alright, now the database was running and highly available. We tried a failover to the other node and that worked.
Meanwhile we had Orchestrator itself stopped.
So ran the steps in the technet article and started the services again. Started the Runbook Designer.
Error.
The following is this error:
The license for System Center 2012 Orchestrator has expired or is invalid. Enter Product Key.
It asks me to enter the Orchestrator license. Huh? Actually when doing that and pressing Enter it also did not want to continue.
This was very irritating. In the end I think we managed to cancel through it and get to the runbooks, most of which would not start. Or would start and within 5 seconds stop again.
:no:
Checking for the errors in the runbook designer it turns out that the connections could not be made to for instance the SCOM server.
We investigated and finally I found this article on the TechNet site:
Migrate Orchestrator Between Environments
http://technet.microsoft.com/en-us/library/hh913929.aspx
Among the very first things I saw was the first step involved in bold:
Back up SQL Server service master key in environment A
Now at first I did not know what this was or what it does, but I am pretty sure this is what our problem relates to! 🙄
It did irritate me that those TechNet pages did not reference each other though! Also as you will see they do not take into account what happens when you have multiple target servers such as when using Always On functionality. ❗
First thing I did was think that it was too late already to get that thing moved over. So I went ahead and created a fresh key. Next thing to do was enter the encrypted info again. So go find the connection settings for connecting to SCOM and e-mail and SCCM and so on and re-enter passwords. I opened up all runbooks and re-selected those connections to be sure. Also the System Center license key.
Restart the Orchestrator management service and we were back up and running.
However, hold on! 88|
A little later this went wrong again and it was due to a manual fail over of the availability group the database was living in. And again we had runbooks failing and again the enter your license key error!
Turns out I missed something important there!
Alright this is how it works:
– the data you enter like passwords and so is stored encrypted in the Orchestrator database.
– a key from the database is used for this.
– however this key can only be read or decrypted using the SQL Server Service Master key!
Yeah and this Master key is not inside the datasbase but is stored in the SQL system tables. Meaning it is related to the Instance and not the database. Now in a normal failover cluster we would not have run into the second time to see the Orchestrator go nuts on this. However with this always on cluster there are two machines with their own SQL instance. And the databases are made in sync. But there are two instances!
If the Server Master key is not the same (which it isn’t) the data can not be descrypted with failover. So I should have used the key from the original server in the first place or just have replaced it after seeing the error. I did not know it could still be done afterwards. And also did not know that in this case
SO what to do:
Open up SQL management studio and run this command to check the master key:
USE master
SELECT * FROM sys.symmetric_keys
(by the way you see here that its not database related by instance related because it is asking the master database for this info and of course its taking it from the system tables).
You will see the servicemasterkey and the Guid belonging to that key.
And of course this was different between the two servers and would have been different from the original database server as well.
So to fix it we need to create a backup of the first key and restore it onto the second server.
First create a directory on the first SQL server to store the backup of the key in. Next run a backup command to backup the key. I have changed the password a bit in this post.
BACKUP SERVICE MASTER KEY
TO FILE = ‘C:backupservice_master_key’
ENCRYPTION BY PASSWORD = ‘3dH85Hhk004GHk2597ghefj5’;
Next I created the same directory on the second server and copied the key bakcup file to it. Open a SQL query on the second server and import the key:
RESTORE SERVICE MASTER KEY
FROM FILE = ‘C:backupservice_master_key’
DECRYPTION BY PASSWORD = ‘3dH85Hhk004GHk2597ghefj5’ FORCE
GO
By the way this last command uses the FORCE option because it will find a database which is already using encrypted data, so we need to force it to use this master key. Make sure you take the key from the working machine and put it on the not working machine though!
Now we check again for the master key and see that they are the same. I restarted SQL Server Service.
Restarted orchestrator services on the orchestrator management server and runbook servers.
Worked! 😀
So there were a few lessons learned. Sure wish I had read the second article earlier as it could have prevented part of this.
A few of these lessons learned came back when we did the SCOM operational database, but that is a story for next time.
Good luck!
Bob Cornelissen