Monitor Azure Backup with Log Analytics

Note: This is an entry for the TopQore Blog Writing contest for ExpertsLive India. It is written by Gourav Kumar from India.

Introduction

In this blog post we will explore how to monitor long-running Azure backups and trigger an alert on them with the help of Log Analytics.

Before we start, I would like to mention that we are going to use the new Azure Backup Reports feature and monitor it with some fancy KQL.

So let's get started.

Keeping all readers in mind, and for ease of understanding, I am dividing this into 3 parts.

  1. Enabling the Azure Backup report
  2. Writing a KQL query to check for long-running backups
  3. Configuring an alert on them (creating a signal)

Starting with the first part.

I am assuming you are already using Recovery Services vaults to back up your Azure infrastructure.

The very first step is to log in to the Azure portal, click All services, and search for Recovery Services vaults.

We can then see all the available Recovery Services vaults in the Azure portal. So what are we waiting for? Click the one for which we want to enable the Azure Backup report and monitor backup duration.

As we can see, I have two Recovery Services vaults available, and I am going to pick the first one.

Inside the Recovery Services vault we have a blade named Backup Reports, and this is the hero of our first part.

Click the Backup Reports button and start configuring the backup report by selecting Diagnostics Settings.

Note: Backup Reports is a preview feature, so it might not be available in some regions. However, as far as I could tell, it is available in all regions.

Now we are going to configure the backup report, and this will be the last step of our first part.

Click Add diagnostic setting.

We can stream backup logs to an Azure Event Hub, a Storage account, or Log Analytics. The choice depends on you and your constraints, but in our case the best fit is Log Analytics, so that is the one I am going to use.

To enable it, simply follow the steps below. It is very easy to configure: just fill in or click the following fields.

  1. Enter a name for the report.
  2. Tick Send to Log Analytics (and configure the Log Analytics workspace).
  3. In the log section, select only the Azure Backup Report option.
  4. Hit Save.
 

It looks like the image below.

This brings us to the end of the first part.
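For readers who would rather script this than click through the portal, here is a minimal sketch of what the same diagnostic setting might look like as a request body. It only builds the JSON. The field names (properties, workspaceId, logs, category, enabled) reflect my understanding of the Azure Monitor diagnostic settings resource and should be checked against the official documentation before use. The workspace resource ID is a made-up placeholder and build_diagnostic_setting is a hypothetical helper name.

```python
import json


def build_diagnostic_setting(workspace_resource_id: str) -> dict:
    """Build a diagnostic-setting body that sends only the
    AzureBackupReport log category to a Log Analytics workspace.

    Assumption: the body shape follows the Azure Monitor
    diagnostic settings resource; verify before using it.
    """
    return {
        "properties": {
            # Equivalent of ticking "Send to Log Analytics" in the portal.
            "workspaceId": workspace_resource_id,
            # Equivalent of selecting only the Azure Backup Report log.
            "logs": [
                {"category": "AzureBackupReport", "enabled": True},
            ],
        }
    }


# Hypothetical workspace resource ID, for illustration only.
EXAMPLE_WORKSPACE_ID = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/example-rg"
    "/providers/Microsoft.OperationalInsights"
    "/workspaces/example-workspace"
)

if __name__ == "__main__":
    body = build_diagnostic_setting(EXAMPLE_WORKSPACE_ID)
    print(json.dumps(body, indent=2))
```

The sketch deliberately stops at building the body. How you submit it (portal, CLI, template or REST call) is up to you and your constraints.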

Now for the second part of this blog: a KQL query for long-running Azure backups.

AzureDiagnostics
| where TimeGenerated > ago(1d)
| where Category == "AzureBackupReport"
| where OperationName == "Job"
| where todouble(DataTransferredInMB_s) > 1
| extend Report_Running_Time_UTC = TimeGenerated
| extend Backup_Job_Start_Time = JobStartDateTime_s
// To keep only rows whose start time is in AM/PM format, uncomment the line below by removing //
//| where (Backup_Job_Start_Time contains "AM" or Backup_Job_Start_Time contains "PM")
| extend DataTransferedGB = todouble(DataTransferredInMB_s) / 1024
| extend JobDurationHour = todouble(JobDurationInSecs_s) / 3600
| where JobDurationHour > 3
| extend Vault_Name = split(ResourceId, '/')[-1]
| extend Server_Name = split(BackupItemUniqueId_s, ';')[-1]
| project Report_Running_Time_UTC, Backup_Job_Start_Time, SubscriptionId, JobOperation_s, JobStatus_s, DataTransferedGB, JobDurationHour, ResourceGroup, Server_Name, Vault_Name, Level

Now that I have given the query above, it is time to explain it so you can all trust me and run it in your environment.  😀

In this query we collect the last one day [where TimeGenerated > ago(1d)] of Azure backup job data, keep only jobs that transferred more than 1 MB, and return those that took longer than 3 hours [where JobDurationHour > 3].

You can change 3 to 4, 5, 6 hours and so on, as suits your environment.
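If the pipeline of where and extend steps is hard to follow, the same logic can be sketched in plain Python. This is only an illustration of what the query computes, not something to run against Azure. The column names come from the query, but the sample rows are made up, long_running_jobs is a hypothetical helper, and I assume every field is populated (in KQL, todouble returns null for a value it cannot parse, whereas float here would raise an error).

```python
from typing import Dict, List


def long_running_jobs(rows: List[Dict[str, str]],
                      min_hours: float = 3.0) -> List[Dict[str, object]]:
    """Mirror the KQL query: keep jobs that moved more than 1 MB
    and ran longer than min_hours, then derive the extra columns."""
    results = []
    for row in rows:
        transferred_mb = float(row["DataTransferredInMB_s"])
        if transferred_mb <= 1:
            continue  # where todouble(DataTransferredInMB_s) > 1

        duration_hours = float(row["JobDurationInSecs_s"]) / 3600
        if duration_hours <= min_hours:
            continue  # where JobDurationHour > 3

        results.append({
            # split(...)[-1] takes the last segment in both KQL and Python.
            "Vault_Name": row["ResourceId"].split("/")[-1],
            "Server_Name": row["BackupItemUniqueId_s"].split(";")[-1],
            "DataTransferedGB": transferred_mb / 1024,
            "JobDurationHour": duration_hours,
        })
    return results


# Made-up rows for illustration; real values will look different.
SAMPLE_ROWS = [
    {   # 4-hour job that moved 2048 MB: should be reported
        "ResourceId": "/SUBSCRIPTIONS/EXAMPLE/RESOURCEGROUPS/EXAMPLE-RG/VAULTS/EXAMPLEVAULT",
        "BackupItemUniqueId_s": "example;container;examplevm",
        "DataTransferredInMB_s": "2048",
        "JobDurationInSecs_s": "14400",
    },
    {   # 1-hour job: filtered out by the duration check
        "ResourceId": "/SUBSCRIPTIONS/EXAMPLE/RESOURCEGROUPS/EXAMPLE-RG/VAULTS/EXAMPLEVAULT",
        "BackupItemUniqueId_s": "example;container;quickvm",
        "DataTransferredInMB_s": "512",
        "JobDurationInSecs_s": "3600",
    },
]

if __name__ == "__main__":
    for job in long_running_jobs(SAMPLE_ROWS):
        print(job)
```

Passing a different min_hours value corresponds to changing the 3 in the query.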

Now moving towards the close of this blog post.

Let's start the last part of the blog (part three).

Many thanks for bearing with my blog so far.

Now the most important thing is how to set up an alert, and why we need one.

Taking the second question first:

“In general we have multiple VMs in Azure, and as DevOps and Azure admin engineers we all have daily tasks and troubleshooting to do during business hours. Because of this we hardly open every tab of the Azure portal every day, and so we miss unexpected behaviour of Azure Backup (long running time being the major one). To keep ourselves in sync with the Azure portal, this is a smart move by me.” :P

Now back to work, finishing this up by setting an alert for long-running backups:

Open Log search in the Log Analytics workspace that was selected at the time of report configuration.

Copy and paste the above query into the query tab of Log search and hit Run.

If any server has taken longer than 3 hours to complete its backup, it will appear in the output.

Bingo, I have output.

Time to set up an alert for these beasts.

Click New Alert Rule.

Now for the last few clicks.

Click the condition.
Then fill in the threshold value with a count, e.g. 1 (I am taking 1 since I want an alert for every server and result).
Done.

Note: You can also change the Period and Frequency of the alert if you wish.
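One thing worth double-checking when you enter the threshold is the comparison the condition uses against the number of results, since this post does not state it. The small sketch below is not Azure's evaluation code. It is a hypothetical helper (alert_fires) that shows how a threshold of 1 behaves under two common comparisons, so you can pick the value that matches your intent.

```python
import operator

# Two comparisons a "number of results" condition might use.
# Which one applies to your alert rule is an assumption to verify
# in the condition blade.
COMPARISONS = {
    "greater than": operator.gt,
    "greater than or equal to": operator.ge,
}


def alert_fires(result_count: int, threshold: int, comparison: str) -> bool:
    """Return True if result_count satisfies the comparison against threshold."""
    return COMPARISONS[comparison](result_count, threshold)


if __name__ == "__main__":
    for name in COMPARISONS:
        for count in (0, 1, 2):
            print(f"{count} result(s), threshold 1, '{name}':",
                  alert_fires(count, 1, name))
```

With "greater than", a threshold of 1 only fires once two or more long-running jobs are returned, so a threshold of 0 would be needed to catch a single job. With "greater than or equal to", a threshold of 1 fires on the first one.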

Once done with all this, configure the other alert requirements such as the alert name, the action group (to whom the mail should go), webhooks and other conditions.

That's all. We have created an alert for when any Azure VM's backup takes longer than 3 hours. We will receive an alert notification mail with all the information, after which we can look into the jobs and take whatever action is needed.

Many thanks for reading. I very much hope you liked this blog and the approach :)

Feel free to connect with me and share your thoughts and feedback about this!