Making an Azure poison queue Slack notifier

I'm currently working at a place where we use queue-triggered WebJobs to handle sending messages like email and SMS (using SendGrid and Twilio). A queue-based system is great for this because it lets us replay any queue messages should one of the third parties (or our own code) fail to send them. Since we are connecting to third parties, you can almost guarantee there's going to be some form of failure, so it's always good practice to lean on this type of architecture to handle the unknown. We have the following setup:

  • Website > Storage Queue > WebJob > SendGrid
  • Website > Storage Queue > WebJob > Twilio

When a WebJob repeatedly fails to process a queue message (five attempts by default), the message is automatically moved from the message queue into a poison queue. These queues are always suffixed with "-poison" (MS really wanted to highlight how toxic your problems are), like so:

  • email - For normal operation
  • email-poison - Messages moved here when a failure occurs
  • sms
  • sms-poison
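
The naming convention is simple enough to capture in a couple of helper methods. This is only a small sketch: the class and method names (PoisonQueueNames, IsPoisonQueue, PoisonQueueFor) are made up for illustration and are not part of any Azure SDK.

```csharp
using System;

public static class PoisonQueueNames
{
    // Suffix that marks a poison queue, as in the list above.
    private const string PoisonSuffix = "-poison";

    // True when the queue name follows the poison queue naming convention.
    public static bool IsPoisonQueue(string queueName)
    {
        return queueName.EndsWith(PoisonSuffix, StringComparison.Ordinal);
    }

    // Returns the poison queue paired with a normal queue, e.g. "email" -> "email-poison".
    // A name that is already a poison queue is returned unchanged.
    public static string PoisonQueueFor(string queueName)
    {
        return IsPoisonQueue(queueName) ? queueName : queueName + PoisonSuffix;
    }
}
```

For example, PoisonQueueNames.PoisonQueueFor("sms") gives "sms-poison", while PoisonQueueNames.IsPoisonQueue("sms") is false.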

Gaining visibility of what's in a poison queue is really important for knowing the health of your system. So I embarked on a quest to find an alert setting, buried deep somewhere in the Azure portal, that would surface any messages going into a poison queue. I assumed this would be a metric alert of some kind in the 'Storage Account', 'Alerts' or perhaps even 'Application Insights' blade. After spending a while searching for it, as well as posting a Stack Overflow question (it wasn't a popular one..), I started doubting whether it even existed! I even tried the search box at the top of the Azure dashboard as a last-ditch effort, hoping it would provide answers. You'd think this would exist somewhere (if it does and my eyes have deceived me, please do get in touch), or at the very least be visible and easy to find. Alas, this was not the case.

So I decided to do something about it. Why not have an Azure Function that takes a storage account and looks through all its queues to check whether any poison queue messages exist? While we're at it, we could also check whether messages are stacking up in the non-poison queues (just in case a WebJob has been turned off or can't process a certain message), and even include the content of a problematic queue message. Since our team uses Slack for communication, I decided to send the notification to Slack. Below are the steps I took:

Step 1 - Setting up Slack

Setting up Slack is quick and easy: just create a 'poison-queue' channel and create a new integration in the custom integrations section. Note that you're going to need admin access to do this (I have provided a link at the bottom of this article, as it's nested deep in their UI). An integration is essentially a webhook endpoint for us to post JSON data to (I have added a link for Slack's JSON format below too, as well as a message builder to help customise the look and feel).

The picture below shows where you can get your webhook URL from.

Step 2 - Create your Azure Function

Since this is not a tutorial on Azure Functions, I'm going to skip the details here. Microsoft, however, has provided some great documentation on this (with pictures!) to help you out. Links are at the end, until MS breaks them. By the way, you're going to need a cron expression to define the schedule this function runs on. If you hate cron as much as I do, worry not! Use my cron expression for a daily sobering alert at 9:00: 0 0 9 * * * (the six fields are second, minute, hour, day, month and day of week).
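
For reference, here's roughly how that schedule sits in the function's function.json, assuming a timer trigger created from the portal template. The binding name myTimer is taken from the Run method in Step 6; the rest follows the standard timer binding layout. Bear in mind that the schedule is evaluated in UTC unless you configure a different time zone for the function app.

```json
{
  "bindings": [
    {
      "name": "myTimer",
      "type": "timerTrigger",
      "direction": "in",
      "schedule": "0 0 9 * * *"
    }
  ],
  "disabled": false
}
```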

Step 3 - Create your Slack message structure

Next we can create the basic structure needed for our Slack message, expressed as a C# class. My class is actually quite simple and is missing quite a few properties. To get a sense of all the customisations Slack offers, have a look at the links below.

#r "Newtonsoft.Json"

#load "Attachments.csx"

using Newtonsoft.Json;
using System.Collections.Generic;

public sealed class SlackMessage
{
    public SlackMessage()
    {
        Attachments = new List<Attachments>();
    }

    [JsonProperty("channel")]
    public string Channel { get; set; }

    [JsonProperty("username")]
    public string UserName { get; set; }

    [JsonProperty("text")]
    public string Text { get; set; }

    [JsonProperty("attachments")]
    public List<Attachments> Attachments { get; set; }

    [JsonProperty("icon_emoji")]
    public string Icon
    {
        get { return ":computer:"; }
    }
}

#r "Newtonsoft.Json"

using Newtonsoft.Json;

public class Attachments
{
    [JsonProperty("color")]
    public string Colour { get; set; }

    [JsonProperty("title")]
    public string Title { get; set; }

    [JsonProperty("text")]
    public string Text { get; set; }
}

Remember to create these classes as .csx files (SlackMessage.csx and Attachments.csx, to match the #load directives) so the Azure Function can understand them.
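
To give a feel for what ends up being posted, here's roughly the JSON these two classes should serialize to, assuming Json.NET's default settings (which include null properties such as username). The queue name and count are made-up sample values, the angle-bracket parts are placeholders, and the output is indented here for readability.

```json
{
  "channel": "poison-queue",
  "username": null,
  "text": "Poison Queue Alerts",
  "attachments": [
    {
      "color": "danger",
      "title": "Queue: email-poison, Message Count: 3",
      "text": "Insertion Time: <timestamp>, Sample Contents:\n <message body>"
    }
  ],
  "icon_emoji": ":computer:"
}
```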

Step 4 - Create a Slack client to post the message

Now that we have our message structure, we can create a class to serialize the message and post the JSON to Slack using the webhook created in Step 1. Below is the code to do this:

#r "Newtonsoft.Json"
#r "System.Web.Extensions"
#r "System.Web"

#load "SlackMessage.csx"
#load "Attachments.csx"

using System.Net;
using Newtonsoft.Json;
using System.Collections.Specialized;

public class SlackClient
{
    public static readonly string WebHook = @"https://hooks.slack.com/services/XXXXXXXX/XXXXXXXXXXXXXXXXXXXXXXXXXXX";

    public void SendMessage(SlackMessage message)
    {
        string payloadJson = JsonConvert.SerializeObject(message);
        
        using (WebClient client = new WebClient())
        {
            NameValueCollection data = new NameValueCollection();
            data["payload"] = payloadJson;
            client.UploadValues(WebHook, "POST", data);
        }
    }
}

It's good practice to move the webhook URL into the settings file; for simplicity I have included it in this class.
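
If you do want to move it out, one option is to read it from the function app's application settings, which Azure Functions exposes as environment variables. Here's a minimal sketch: the SlackSettings class and the "SlackWebHookUrl" setting name are my own inventions for illustration, so use whatever key you add in the portal.

```csharp
using System;

public static class SlackSettings
{
    // Hypothetical app setting name; it must match the key you create in the portal.
    public const string WebHookSettingName = "SlackWebHookUrl";

    // Reads the webhook URL from the app settings (surfaced as environment variables)
    // and fails loudly if the setting has not been configured.
    public static string GetWebHook()
    {
        var webHook = Environment.GetEnvironmentVariable(WebHookSettingName);
        if (string.IsNullOrWhiteSpace(webHook))
        {
            throw new InvalidOperationException(
                $"App setting '{WebHookSettingName}' is missing or empty.");
        }

        return webHook;
    }
}
```

SlackClient could then call SlackSettings.GetWebHook() inside SendMessage instead of using the hard-coded WebHook field, which keeps the secret URL out of source control.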

Step 5 - Queue Checker

Next we need to add code that loops through the storage accounts we pass it (as account name and key pairs), checks all the queues and adds an alert to the Slack message if we think something is wrong.

#r "Microsoft.WindowsAzure.Storage"

#load "SlackClient.csx"
#load "SlackMessage.csx"
#load "Attachments.csx"

using System.Collections.Generic;
using System.Linq;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Auth;

public class PoisonQueueChecker
{
    public void CheckPoisonQueues(Dictionary<string, string> storageConnectionStrings)
    {
        var slackClient = new SlackClient();
        var slackMessage = new SlackMessage { Text = "Poison Queue Alerts", Channel = "poison-queue" };

        foreach (var storageConnectionString in storageConnectionStrings)
        {
            var storageCredentials = new StorageCredentials(storageConnectionString.Key, storageConnectionString.Value);
            var storageAccount = new CloudStorageAccount(storageCredentials, true);
            var queueClient = storageAccount.CreateCloudQueueClient();

            var queues = queueClient.ListQueues();
            foreach (var queue in queues)
            {
                queue.FetchAttributes();
                //Gets the total messages in the queue
                var queueCount = queue.ApproximateMessageCount;

                if (queueCount > 0)
                {
                    var isPoisonQueue = queue.Name.EndsWith("poison");
                    var attachment = new Attachments();
                    attachment.Title = $"Queue: {queue.Name}, Message Count: {queueCount}";
                    attachment.Colour = isPoisonQueue ? "danger" : "warning";

                    //Note the peek function will not dequeue the message
                    var message = queue.PeekMessage();
                    //The count is only approximate, so the peek can still come back empty
                    if (message != null)
                    {
                        //No verbatim (@) prefix here, so that \n is a real line break
                        attachment.Text = $"Insertion Time: {message.InsertionTime}, Sample Contents:\n" +
                                            $" {message.AsString}";
                    }

                    slackMessage.Attachments.Add(attachment);
                }
            }
        }

        //Add a message showing all is well, once every storage account has been checked
        if (!slackMessage.Attachments.Any())
        {
            slackMessage.Attachments.Add(new Attachments { Title = "All queues are operational and empty", Colour = "good" });
        }

        slackClient.SendMessage(slackMessage);
    }
}

Step 6 - Bringing it all together

The final step is to hook up the function's Run method like so:

#load "PoisonQueueChecker.csx"

using System;
using System.Collections.Generic;

public static void Run(TimerInfo myTimer, TraceWriter log)
{
    log.Info($"C# Timer trigger function executed at: {DateTime.Now}");

    var storageConnectionStrings = new Dictionary<string, string>();
    storageConnectionStrings.Add("storagename", "storagekey");

    var poisonQueueChecker = new PoisonQueueChecker();
    poisonQueueChecker.CheckPoisonQueues(storageConnectionStrings);
}

And that's it. At 9 o'clock tomorrow you can finally start gaining visibility of those poison queues, and start worrying about the dodgy lines of code causing your messages to be poisoned.

Helpful Links

Custom Integrations

https://<<yourslackgroupname>>.slack.com/apps/manage/custom-integrations

Customising your Slack message

https://api.slack.com/docs/messages/builder

How to send a Slack message to your webhook

https://api.slack.com/custom-integrations/incoming-webhooks

How to create an Azure function

https://docs.microsoft.com/en-us/azure/azure-functions/functions-create-first-azure-function

How to code up an Azure function

https://docs.microsoft.com/en-us/azure/azure-functions/functions-reference-csharp