Event-Driven Datacenter with vRO

Learn how to build an event-driven datacenter with vRO

I like the concept of the Self-Healing, Event-Driven datacenter. The fundamental principle involves automatically responding to predefined events. For instance, suppose we have a logging system (the specific implementation is inconsequential) capable of detecting a predefined issue and initiating an HTTP POST request to an external system. The external remediation system will receive this HTTP call and subsequently take appropriate action. By adhering to this straightforward logic, we can automate a wide range of tasks.

Numerous excellent tools are available to help us achieve this objective. However, many of them are tied to a specific piece of software or vendor. What if we wanted to create a general solution that can cover various use cases? Let’s explore how we can implement this concept with VMware Aria Orchestrator (formerly vRealize Orchestrator, vRO) in a way that is reliable and scalable.

The goals:

  • Implement an automated solution for day-to-day occurrences.
  • Proactively address security concerns with automatic remediation of potential threats.
  • Seamlessly handle a multitude of events simultaneously.
  • Empower our workflows with code-driven remediation capabilities.

To reach our goals, we will need to enhance our infrastructure with additional components, including webhook servers capable of receiving HTTP requests and passing them to a message queue. The purpose of the message queue is straightforward: with potentially dozens or even hundreds of alerts arriving at the same time, we want to ensure that none of them is lost. The queue also lets us use a vRO feature called AMQP Policy.

A use case:

We would like to disconnect a VM from the network when a specific event occurs on it.

  1. The event is triggered in the Virtual Machine (VM).
  2. The event is sent to the logging server.
  3. The logging server sends a POST API call to the webhook server, containing all the necessary details (such as the VM’s hostname).
  4. The webhook server validates the received data and sends the message to the queue.
  5. In the queue server, the message is routed to a specific destination based on the key received from the event. This allows multiple types of events to be supported and sent to different queues.
  6. The orchestrator monitors the queue and retrieves the message.
  7. The orchestrator executes the workflow, which disconnects the VM from the network.
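
Step 5 resembles what AMQP calls a direct exchange: a queue is bound with a key, and a message is delivered to the queues whose binding key equals the message's routing key. The snippet below is a minimal in-memory sketch of that idea, not RabbitMQ itself. The event keys (`vm.isolate`, `vm.snapshot`), the queue `snapshotQueue`, and the hostnames are hypothetical names used only for illustration.

```python
from collections import defaultdict, deque


class DirectExchange:
    """In-memory stand-in for an AMQP direct exchange (illustration only)."""

    def __init__(self):
        self.bindings = defaultdict(list)   # routing key -> bound queue names
        self.queues = defaultdict(deque)    # queue name -> pending messages

    def bind(self, queue, routing_key):
        self.bindings[routing_key].append(queue)

    def publish(self, routing_key, message):
        """Deliver to every queue bound with this exact key; return their names."""
        targets = list(self.bindings.get(routing_key, []))
        for queue in targets:
            self.queues[queue].append(message)
        return targets


exchange = DirectExchange()
exchange.bind("myRemoteQueue", "vm.isolate")    # one remediation type per key
exchange.bind("snapshotQueue", "vm.snapshot")   # hypothetical second event type

delivered = exchange.publish("vm.isolate", {"hostname": "web-01"})
unrouted = exchange.publish("vm.unknown", {"hostname": "web-02"})
```

A message with a key that nothing is bound to ends up in no queue at all, so the webhook server should only emit keys that have a matching binding.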

The solution

VM

The implementation method will vary depending on the chosen solution for monitoring the OS event. However, in general, most modern centralized logging solutions gather logs from the guest OS (both Windows and Linux) either using a built-in logging solution or by installing certain agents.

Logging

A central logging system enables the creation of alerts based on specific criteria. Therefore, you can define your alerts based on your needs.

Webhook server

This task may require finding an existing solution or creating your own. Luckily, there are numerous ways to do this nowadays. A straightforward solution could be to use FastAPI. It is possible to make a small container to run an API server that receives HTTP requests and passes them to the queue using the AMQP protocol.
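
To make the shape of such a server concrete, here is a minimal sketch. It uses Python's standard-library `http.server` instead of FastAPI so that it runs without extra dependencies; the same `validate` function could be called from a FastAPI route. The payload fields (`hostname`, `event`), the port, and the `publish` callback are assumptions of this sketch, not something a particular logging product dictates.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUIRED_FIELDS = ("hostname", "event")   # hypothetical alert schema


def validate(payload):
    """Return (routing_key, body) for a well-formed alert, else raise ValueError."""
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    for field in REQUIRED_FIELDS:
        value = payload.get(field)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"missing or empty field: {field}")
    body = json.dumps({"hostname": payload["hostname"].strip()})
    return payload["event"].strip(), body


def make_handler(publish):
    """Build a request handler that forwards valid alerts to publish(key, body)."""

    class WebhookHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            try:
                key, body = validate(json.loads(self.rfile.read(length)))
            except ValueError as exc:   # JSON decoding errors are ValueErrors too
                self.send_error(400, explain=str(exc))
                return
            publish(key, body)
            self.send_response(202)     # accepted for asynchronous processing
            self.end_headers()

        def log_message(self, *args):   # keep the sketch quiet
            pass

    return WebhookHandler


def serve(publish, host="127.0.0.1", port=8080):
    """Blocking entry point; the port is an arbitrary choice."""
    HTTPServer((host, port), make_handler(publish)).serve_forever()
```

Calling `serve(print)` starts the server and prints every accepted alert instead of queueing it, which is handy for checking what the logging system actually sends. In a real deployment `publish` would hand the message to an AMQP client.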

Message Queue

One of the most popular message queue services is RabbitMQ, and vRO can connect to it through its AMQP plug-in. Therefore, we’ll use it as the queue server.

Deploy a RabbitMQ server and create a new queue myRemoteQueue with all the defaults.
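
If you would rather script the queue than click through the management UI, the following sketch shows one way to do it with the third-party `pika` client (my choice for illustration; any AMQP 0-9-1 client would do). The host name is a placeholder, and `durable=True` is an assumption: it has to match how the queue was actually created, because redeclaring an existing queue with different arguments fails.

```python
def connect(host="localhost"):
    """Open a connection and channel to RabbitMQ; requires the `pika` package."""
    import pika  # imported lazily so the helpers below can be used without it

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    return connection, connection.channel()


def declare_queue(channel, name="myRemoteQueue"):
    """Create the queue if it is missing; a no-op when it exists with the same arguments."""
    channel.queue_declare(queue=name, durable=True)


def publish(channel, body, queue="myRemoteQueue"):
    """Send a message through the default exchange, which routes by queue name."""
    channel.basic_publish(exchange="", routing_key=queue, body=body)
```

A typical session would call `connect()`, then `declare_queue(channel)` and `publish(channel, '{"hostname": "web-01"}')`, and finally close the connection.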

vRO

Now, we can start to configure our vRO:

  1. Broker configuration
  2. Remediation implementation
  3. AMQP policy configuration

This work by Leonid Belenkiy is licensed under Creative Commons Attribution 4.0 International