Automating alert response with Azure Security Center and Azure Logic Apps

Responding to a security event is a core practice in modern security frameworks. Once a potential threat has been detected, it is time to act. The shorter the response time, the less damage an attacker can deal to your cloud.

Detection in Azure

Azure Security Center in the Standard pricing tier ($15/VM node per month) comes with automated detection mechanisms. The core detection capability is built around parsing real-time traffic and system logs and applying machine learning algorithms to them:

security-center-detection-capabilities-fig1

A unified alerts dashboard can be found under the Security Center -> Security Alerts blade and also on the main page of Security Center:

alertsdetection

Alerts represent single or multiple security events of the same nature and time span. Incidents are created from multiple alerts that are classified as related to each other – for example, an attacker runs a malicious script, extracts local password hashes, and clears the event log. This sequence of actions will generate one incident.

Incident forensics

Incidents can be investigated with a forensics tool, the Investigation Dashboard (in preview as of May 2018). This tool draws the relationships between alerts, the events that caused them, the affected resources, and users. It can also help reconstruct the lateral movement of attackers within the network.

investigation.PNG

Automated response

Incident forensics is a post-mortem investigation: an adversary event has already happened, and the attackers have already done some damage to the enterprise. But we don’t have to wait until malicious actors finish their job – we can start acting right after the first signals of an intrusion. Alerts are generated by Azure in real time, and Security Center recently got a powerful integration with Azure Logic Apps.

Logic Apps in Azure are workflows built from pre-built triggers, conditions, and actions, which include a wide range of both native and third-party components. For example, your logic app can listen to an RSS feed and automatically tweet once new items are published to it, or run a custom PowerShell script through Azure Automation.

One of the recent additions to Logic Apps is Security Center triggers. This feature turns Azure security alerts into a powerful tool for fighting attackers the moment they trip a wire.

You can find security-related Azure Logic Apps under Security Center -> Playbooks (Preview).

Building the logic

After adding a new playbook, the user is presented with the Logic App Designer. The trigger is pre-populated – “When a response to an Azure Security Center alert is triggered”. Once we get an alert, the playbook is executed. Then we add a condition – the alert arrives with multiple parameters. Let’s take “Alert Severity” and set the condition to High:

trigger

Other alert parameters include Confidence Level, Alert Body, Name, Start or End Time, and many more. The range is quite broad, which makes it possible to generate very specific responses to almost any imaginable event.

Now, if the condition is TRUE – Alert Severity is High – we want to contain the threat. One way to do so is to isolate the VM under attack: for example, assign it to a different Network Security Group that has no connection to the internal company network or some of its segments. To do that, we need to get the VM name from the alert and run some Azure PowerShell that performs the NSG re-assignment.

Creating the Automation runbook

Now, we can go to Azure Automation and create a runbook for our needs. This can be done through the blades Automation Accounts -> Runbooks -> Add a runbook. As the Runbook type, choose “PowerShell”.

Then, we insert the following code:

Param(
  [string]$VMName
)

$connectionName = "AzureRunAsConnection"

try
{
  # Get the connection "AzureRunAsConnection"
  $servicePrincipalConnection = Get-AutomationConnection -Name $connectionName

  Add-AzureRmAccount `
    -ServicePrincipal `
    -TenantId $servicePrincipalConnection.TenantId `
    -ApplicationId $servicePrincipalConnection.ApplicationId `
    -CertificateThumbprint $servicePrincipalConnection.CertificateThumbprint
}
catch {
  if (!$servicePrincipalConnection)
  {
    $ErrorMessage = "Connection $connectionName not found."
    throw $ErrorMessage
  } else {
    Write-Error -Message $_.Exception
    throw $_.Exception
  }
}

# Get the VM object
$vm = Get-AzureRmVM -Name $VMName -ResourceGroupName AzureBootcamp

# Get the NIC attached to the VM
$Nic = Get-AzureRmNetworkInterface -ResourceGroupName AzureBootcamp | Where-Object {$_.VirtualMachine.Id -eq $vm.Id}

# Change the Network Security Group to IsolatedNetwork-NSG
$Nic.NetworkSecurityGroup = Get-AzureRmNetworkSecurityGroup -ResourceGroupName AzureBootcamp -Name "IsolatedNetwork-NSG"

# Apply the changes
Set-AzureRmNetworkInterface -NetworkInterface $Nic

This code takes the VMName as a parameter and authenticates to your Azure subscription with the Azure Run As connection (which requires preliminary configuration). Then it gets the VM’s NIC and assigns it to the network security group “IsolatedNetwork-NSG”. Save the runbook under a name such as IsolateVM, and don’t forget to publish the changes after editing the PowerShell.
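Before wiring the runbook into the playbook, you can sanity-check it by starting it manually from PowerShell. Here is a minimal sketch using the AzureRM.Automation cmdlets; the automation account, resource group, and VM names below are placeholders for your own:

# Start the published IsolateVM runbook by hand with a test VM name
# (the automation account, resource group and VM name are hypothetical)
$job = Start-AzureRmAutomationRunbook `
  -AutomationAccountName "SecurityAutomation" `
  -ResourceGroupName "AzureBootcamp" `
  -Name "IsolateVM" `
  -Parameters @{ VMName = "TestVM01" }

# Check the job status once it has had time to run
Get-AzureRmAutomationJob `
  -AutomationAccountName "SecurityAutomation" `
  -ResourceGroupName "AzureBootcamp" `
  -Id $job.JobId |
  Select-Object Status, StartTime, EndTime

If the NIC of the test VM ends up in the isolated NSG, the runbook is ready to be called from the playbook.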

Putting it all together

The last step is adding the action to the Azure Logic App we’ve been building. Select “Azure Automation – Create job” and point it to the IsolateVM runbook.

logicapptrue.PNG

Here, we specified “Host Name” as the runbook parameter (notice that it automatically picked up the VMName parameter we created in the runbook).

Save the logic app – and that is it. Once an alert is generated, the VM is moved to the isolated network security group with limited access.

Testing and tuning the playbook

To test this integration before an actual event happens, go to any of the previous events under Security Center – Security Alerts (you can generate one, for example, by trying to download Mimikatz from GitHub), click on the event, and then click the “View playbooks” button. In the new window, find your logic app workflow and press “Run” under “Run playbook”:

runpalybook

This sends exactly the same trigger as the alert itself would have done. From the playbook run window or the Run history, you will be presented with a static view similar to the Logic App Designer, with the only difference that it shows the logic path taken in this run:

logicexecution

The actual inputs submitted with the trigger can be viewed by expanding the “When a response to an Azure Security Center alert is triggered” section.

alertdescription

The integration of Azure Security Center alerts with Logic Apps provides nearly limitless capabilities, not only for notifying about detections (via email, Slack, or Skype) but also for automated response to potential attacks by re-configuring the cloud infrastructure and isolating the threat, as shown in the example.

Have fun building your own playbooks and fighting the threats before they become incidents.

Stay secure!

Developers and IT-Pros: who will be left in the past? How to work together efficiently

devvsitpro

Last weekend, I was invited to the Global Azure Bootcamp in Linköping, where among other activities I also participated in a panel discussion about the future of collaboration between Developers and IT-Pros and ways to make it more efficient. It was a brilliant and sharp discussion, the main points of which (or rather my view on them) I would like to share in this post.

IT-Pros

To start with – who are these mysterious creatures? I like the term “IT-Pro” since it covers more than the more common dev counterpart – Operations, or Ops. It is important to understand that Ops refers to the infrastructure people, while from my perspective, IT-Pros include everyone who is NOT a developer: such brave folks as Security, DBAs, Cloud Architects, Consultants, UX designers (why not?!) and many more. Why define a special group for them? It helps to define the boundaries of the conflict – Devs vs !Devs – without actually negating developers, since there are more similarities between these two groups than it seems at first.

The key misconception

“Devs create. IT-Pros don’t create.” This is true… in badly built organizations – and in reality, there are way too many of them. The problem comes from the fact that IT-Pros are usually understaffed. From a dumb manager’s perspective, IT-Pros need just enough headcount to put out fires. This is a grave mistake – only after the fires are out does the REAL work start: upgrading and improving everything that was on fire and replacing it with something that won’t catch fire at all next time. So, IT-Pros are creative, and they do create – when the company is smart enough to let them.

The world we live in

The world of modern computing has adopted (and spoiled) the famous *aaS – as-a-Service – abbreviation. We are quickly moving to a state of technology where everything can be offered as a service: DBs, compute resources, APIs, security tools, you name it. This makes Devs happy and threatens IT-Pros, whose yesterday’s tasks have just been replaced by a new Azure/AWS/Google Cloud service. So how do Devs and IT-Pros respond to these changes?

The shift

The DevOps movement and its rapid adoption caused the famous shift to the left, which basically empowered Developers with tools that were previously managed by Operations. Simultaneously, the Cloud awakened and brought all the power of quick deployments, testing in production, and advanced telemetry to developers. Devs became almighty and can now (almost) do their thing – write code and never be bothered.

What happened to IT-Pros?

The shift hasn’t bypassed the IT-Pro zone – I observe a similar change. Pros are starting to write their own automation and deployment scripts, learning more efficient ways of doing yesterday’s tasks by borrowing the best methods of development and adapting them for IT-Pro work. They read code and write their own code. UX designers create mind-blowing frameworks of design atoms and molecules, put them into source control, and use Continuous Integration. There is no longer a distinction between Devs and IT-Pros based on who writes actual code.

Does that mean there will be no IT-Pros eventually? Will Devs replace them?

No, not at all. If we look at the root of what Devs do and love doing for a living, it is not about spawning VMs in Azure. They consider that a necessary evil, or at best something that enables them. What they do is create. It is like a painter who loves painting but also has to go to IKEA to buy and assemble her easel. The core knowledge of Devs is development itself – building complex distributed systems, efficient workflows, and secure APIs for the needs of the modern world.

On the other hand, we have IT-Pros who, in fact, love deploying machines in Azure and know thousands of ways of doing it for hundreds of use cases. And now they are also starting to code and automate. What we get is a powerful combo that can build the virtual world for the products that Devs are writing.

Together, forever

It is obvious they can’t survive without each other. The world will require more and more complex products – more secure, more resilient, more flexible. And while some have to build them, others have to create the architectures where these products can work at their best. It is not about deploying just a bunch of VMs to Azure and installing SQL Server on them. It is about building an identity-controlled cloud with fully automated threat detection, where the product runs in a couple of dozen containers replicated across a bunch of regions, with a backup and data retention strategy.

And with the ever-changing, fluffy cloud landscape (it’s a cloud, after all), new features become available weekly and sometimes completely change the game overnight. IT-Pros need to be aware of them before they become GA and have an adoption plan ready.

Continuous Integration

The cloud offers so many possibilities; all the cutting-edge tech is up for grabs. But does your product architecture support it? It should. But it doesn’t. That is the very common answer, and it causes months of refactoring and releasing. And – BAM – a newer, cooler tech is out, and we’re back to square one. Who could help the devs here? IT-Pros! If a dev team integrates a Cloud Architect into their architecture meetings, they will be able to plan the future functionality of the product, target a specific cloud, align with its roadmap, and get the best picture of limited-preview features that will be GA by the time the product is released.

Next steps

To adapt, IT-Pros need to become more efficient. Previously, Developers solved this issue for themselves by taking over part of the Ops work and learning the basics of what Ops did. It is time for IT-Pros to do the same with Dev work. With automation and coding skills, IT-Pros will be able to level up the complexity of cloud deployments and at the same time cut the time required for them.

To adapt, Devs need to integrate with IT-Pros when it comes to Cloud, Security, and Design, and make this integration continuous – starting from the design stages, going through development and testing, and … actually lasting forever.

To adapt, organization management needs to staff IT-Pro teams properly and focus them on creating value instead of just putting out fires.

Identifying threats: Software inventory of Azure VMs

Azure VMs recently got a bunch of new features – Inventory, Change tracking, and Update management (GA as of 8 March 2018). These features fill a gap in identifying the software deployed to IaaS clouds – information necessary for securing these resources. The features are based on Azure Automation capabilities and require an Automation account to run the workloads.

The Inventory feature provides visibility into the software installed on a VM (it can be accessed from the individual VM blade), its services and Linux daemons, and also a timeline of events as part of the Change tracking view.

in2

VM view:

Inv1

In this example, we can see Adobe Flash Player and the Steam client installed on the machine – both increase the attack surface of this infrastructure.

If you want a more detailed overview, proceed to the Log Search tab. For example, this query will yield all non-Microsoft applications inventoried in the past day:

ConfigurationData
| where ConfigDataType == "Software"
| where SoftwareType == "Application"
| where Publisher != "Microsoft Corporation"
| order by TimeGenerated desc

logsearch1
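The same query can also be run outside the portal. A rough PowerShell sketch, assuming the AzureRM.OperationalInsights module with the Invoke-AzureRmOperationalInsightsQuery cmdlet is available (exact parameters may differ between module versions); the workspace ID and the selected columns are placeholders:

# Workspace ID (the Customer ID GUID of your Log Analytics workspace) – placeholder value
$workspaceId = "00000000-0000-0000-0000-000000000000"

$query = @"
ConfigurationData
| where ConfigDataType == "Software"
| where SoftwareType == "Application"
| where Publisher != "Microsoft Corporation"
| order by TimeGenerated desc
"@

# Run the query against the workspace and list the inventoried applications
$result = Invoke-AzureRmOperationalInsightsQuery -WorkspaceId $workspaceId -Query $query
$result.Results | Select-Object Computer, SoftwareName, Publisher, TimeGenerated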

The Change tracking view provides visibility into software changes and allows you to track them efficiently. You can also add particular files or registry entries to watch. Watching entire folders is not yet supported for Windows VMs.

ch1.GIF

To get an overview of multiple machines, either:

  • Click “Manage multiple machines” on the Inventory or Change tracking blade
    ch2
    This view can also be used to add non-Azure VMs through the Hybrid Runbook Worker functionality.
  • Or go to the Automation account that was associated with Inventory and Change tracking when they were first enabled. It also provides a convenient view of the associated machines.

Update management can find machines with missing OS updates and fix the issue by scheduling an update deployment.

upd1.PNG

Update management is integrated into Security Center so that critical updates are never missed on any machine in the cloud. But it is important to remember to turn these features on first.

At the moment, only Update management allows taking action based on the gathered data. Inventory has no built-in functionality to remove unwanted applications or perform an on-demand scan; it also lacks reporting capabilities and a global overview of applications in the cloud. Change tracking data can only be used for setting up custom alerts (which requires Log Analytics knowledge).

At the same time, given the direct connection of these features to Azure Automation, I expect more functions to be added that make it possible to fix discovered issues and secure a resource based on the Inventory data. It is clear that Microsoft is taking important action in securing IaaS – and providing the data is only the first step.

P.S. For more advanced Inventory and Software Asset Management solutions, one may look into third-party providers such as Snow Software (where I work at the moment).

 

Using ELK stack for vulnerability management

Have you ever tried to keep all the data about discovered vulnerabilities in one place? When you have about 10 different tools, plus manual records in a task tracker, it becomes a bit of a nightmare.

My main needs were:

  1. Have all types of data in one place – vulnerable computers, pentest findings, automated scan discoveries, software composition analysis, etc.
  2. Have visibility over time in each category for each product – I need graphs and good visualizations to make my points when talking to people.
  3. Have a bird’s-eye view of the state of security in the products – to be able to act on a trend before it materializes into a flaw or a breach.
  4. Easily add new metrics.
  5. The solution needs to be free or inexpensive.

After weeks of thinking and looking for a tool that would save me, I figured out that the solution had always been right next to me. I simply spun up an instance of ELK (Elasticsearch, Logstash, Kibana) and set up a few simple rules with Logstash filters. They work as data hoovers – simply sucking in everything that is sent to them as JSON payloads. The transformed data then ends up in Elasticsearch, and Kibana visualizes the vulnerabilities as trends and shows them on real-time dashboards. It also has a powerful search engine where I can find specific data in a matter of seconds.

I wanted to create a configuration that could process a generic JSON payload assembled by a script after parsing any vulnerability detection tool’s report.

Here is an example Logstash pipeline configuration:

input {
  http { }
}

filter {
  json {
    source => "message"
  }
}

filter {
  date {
    match => [ "date", "ISO8601" ]
  }
}

output {
  elasticsearch {
    hosts => ["localhost"]
    index => "vuln-%{+YYYY.MM.dd}"
  }
}

Here, we ingest HTTP requests posted to the Logstash endpoint as JSON, set @timestamp to the “date” field from the payload (it has to be submitted in ISO8601 format), and send the data to the Elasticsearch instance, into an index named “vuln-<today’s date>”.
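For illustration, this is roughly how a parsing script could push a single finding into the pipeline. A minimal PowerShell sketch; the host name, field names, and the finding itself are made up for the example (the Logstash http input listens on port 8080 by default):

# A hypothetical finding extracted from a scanner report
$finding = @{
    date      = (Get-Date).ToString("o")   # ISO8601, picked up by the date filter
    product   = "webshop"
    branch    = "release/1.2"
    source    = "oss-scan"
    component = "jackson-databind 2.8.8"
    severity  = "High"
    cve       = "CVE-2017-7525"
}

# Post the payload as JSON to the Logstash HTTP input
Invoke-RestMethod -Uri "http://logstash.example.local:8080" `
                  -Method Post `
                  -ContentType "application/json" `
                  -Body ($finding | ConvertTo-Json)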

Let’s say we have three branches that submit open-source component analysis results on every build. Here is how the resulting report may look (the Y-axis represents discovered vulnerable components, the X-axis represents scan dates, split into three sub-plots):

v1

Here, each of the three subplots shows the analysis dynamics of a separate branch.

Or, for separate branches:

v2

And finally, we can build some nice data tables:

v3

This example covers only one case – OSS vulnerability reports. But all kinds of vulnerability data can be submitted to ELK and then aggregated into powerful views for efficient management.

New way of managing on-prem Windows servers securely – Project Honolulu

Project Honolulu is a new tool that attaches a UI to PowerShell and WMI capabilities for managing your servers securely.

I don’t have to explain why connecting to a remote server with RDP is a really, really bad security practice. By default, Windows has no timeout for a disconnected RDP session. In fact, after you close your RDP session, your user (some kind of admin, right?) stays logged in to the server, and God knows what happens while you’re not watching! For example, anyone who gains access to the same server as a non-privileged user can dump in-memory credentials and steal your remote session (e.g. with the help of the infamous Mimikatz tool).

How do you mitigate this problem? Don’t connect to servers with RDP. Ever! Microsoft believes the solution is using WMI (Windows Management Instrumentation) via PowerShell. At least it protects you from the guys who wait on the server for you to log in so they can steal your creds. Sounds great, but I want my GUI back, right?
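Before we get the GUI back, this is roughly what the PowerShell/WMI approach looks like in practice – a small sketch over WinRM/CIM, with the server name as a placeholder:

# Open a CIM session to the server instead of logging in interactively
$session = New-CimSession -ComputerName "SRV01"

# Read basic OS information and the last boot time remotely
Get-CimInstance -CimSession $session -ClassName Win32_OperatingSystem |
    Select-Object Caption, LastBootUpTime

# Restart a service without ever opening an interactive session on the box
Invoke-Command -ComputerName "SRV01" -ScriptBlock { Restart-Service -Name Spooler }

Remove-CimSession $session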

Luckily, we’ve just got a tool for that – Project Honolulu. It executes PowerShell and WMI commands on the backend and surfaces the framework’s capabilities through a lean (and flat, of course!) UI. It allows you to perform most of the operations you would typically do over RDP. Hyper-V and Failover Clusters are also supported.

At the moment, the tool is in Technical Preview, but “it works perfectly in my environment” (c). Download it here.

Some screenshots from my Honolulu:

honolulu

honolulu2

It covers the most common operations, such as modifying firewall rules and local groups, checking logs, the registry, and resource utilization, installing new roles and features, and much more!

Have fun!

Pandora FMS server with docker-compose

Docker-compose is amazing: this tool lets you deploy complex clusters of containers literally with one command. I previously had a seamless experience running ELK (Elasticsearch, Logstash, Kibana) with docker-compose, and now I decided to give Pandora a try with it.

Pandora FMS is a great tool for monitoring and securing the infrastructure since it provides insights into anomalies that may happen to your servers. And it is open source and free!

To start with, I found the three containers required to run the Pandora server on Pandora FMS’s Docker Hub: a MySQL DB instance initialized with the Pandora DB, the Pandora Server, and the Pandora Console.

All the deployment magic happens within each container, and my task was only to create some infrastructure and orchestration around them with the help of docker-compose.

I put the result on Github – pandora-docker-compose.

Here is a quick overview of what it does:

  • Creates a dedicated network and assigns IPs to containers
  • Configures Postfix for sending emails to admins (in the default container it was not working)
  • Synchronizes time with Docker host
  • Maps Pandora DB files to local host folder so that you can back them up and restore

 

Integrating security into DevOps practices

DevOps, as a cultural and technological shift in software development, has generated huge space for improvement in neighboring areas. To name one – application security.

Since DevOps is embedded into every step of an idea’s journey to the customer, it can also be used as a framework for driving security enhancements at reduced cost – the automation and continuous delivery are already built for CI/CD needs. With a gentle security seasoning, the existing infrastructure will bring value to securing the product.

Where to start

As I said, we want security to affect all or most of the steps where the DevOps transformation is already bringing value. Let’s take a look at what we have in an abstract DevOps CD pipeline:

ci_insecure

It is a pretty straightforward deployment pipeline. It starts with requirements that are implemented in code, which is covered with unit tests and built. The resulting artifact is deployed to staging, where it is tested with automation, and a code review also takes place. When all of that is done and has succeeded, the change is merged to master, integration tests run on the merge commit, and the artifacts are deployed to production.

Secure CI/CD

Now, without changing the CD flow, we want to make the application more secure. The boxes in red are the security activities proposed to be added to the pipeline:

ci_secure

Security requirements

At the requirements planning stage (it can be a backlog grooming or sprint planning meeting), we instruct POs and engineers to analyze the security impact of the proposed feature and put mitigations and considerations into the task description. This step requires the team to understand the security profile of the application and the attacker profile, and to have in place a classification of threats based on different factors (data exposure, endpoint exposure, etc.). It requires some preliminary work and is often ignored in Agile environments. However, with security embedded into the requirements, it becomes much simpler for an engineer to actually fix possible issues before they get exploited by an attacker. According to the well-known calculations of the cost of fixing a defect at different stages, addressing security at the design stage costs the least and brings the most value.

In my experience, a separate field in the PBI or a dedicated section in the PBI template needs to be added to make sure the security requirements are not ignored.

Secure coding best practices

For an engineer implementing a feature, it is essential to have a reference for making a particular security-related decision, based on a best-practices document or guidance. It can be a best-practices standard maintained by the company or by the industry – but the team must agree on which particular practice or standard to follow. It should answer simple but important questions – for example, how do we secure an API? How do we store passwords? When do we use TLS?

Implementing this step brings consistency to the secure side of the team’s coding. It also educates engineers and integrates security coding best practices into their routines, forming a security-aware mindset.

Security-related Unit testing

This step assumes that we cover the highest-risk functions and features of the code with unit tests first. It is important to keep the tests fresh and increase coverage alongside ongoing development. One option is to require security unit tests for certain risky features in order to pass code review.

Security-related Automated testing

In this step, the tests cover different scenarios of using and misusing the product. The goal is to make sure security issues are addressed and verified with automation. Authorization, authentication, and sensitive data exposure – to name a few areas to start with.

This set of tests needs to exist separately from the general test set, providing visibility into security testing coverage. The need for new automated security tests can be specified at the requirements design stage and verified during code review.
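To give a flavor of what such a check might look like, here is a minimal sketch of an automated security test written with Pester (v4 syntax); the staging endpoint is hypothetical, and the test only asserts that an unauthenticated request is rejected:

Describe "Orders API authorization" {

    # Hypothetical staging endpoint
    $endpoint = "https://staging.example.local/api/orders"

    It "rejects requests without an access token" {
        $statusCode = $null
        try {
            Invoke-WebRequest -Uri $endpoint -Method Get -UseBasicParsing | Out-Null
        }
        catch {
            # Windows PowerShell throws on non-2xx responses; read the code from the exception
            $statusCode = [int]$_.Exception.Response.StatusCode
        }
        $statusCode | Should -Be 401
    }
}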

Static code analysis

This item doesn’t appear on the diagram but is also worth mentioning. Security-related rules need to be enabled in the static code analysis tool and be part of the quality gate that determines whether a change is ready for production. There is a vast number of plugins and tools that perform automated analysis and catch what the human eye may miss.

Security Code review

This code review needs to be done by a security-minded person or a security champion from a dedicated AppSec team (if there is one). It is important to distinguish it from an ordinary CR and to focus on the security impact and possible code flaws. The person performing the review also makes sure the security requirements are addressed, the required unit and system tests are in place, and the feature is good to go into the wild.

Security-related Automated testing

This is similar to the automated testing in the previous step, with the only difference that here we test the system as a whole, after the change is merged to master.

Results

In the end, we managed to reuse the existing process, adding a few key security-related points with clear rules and visible outcomes. DevOps is an amazing way to help us build a better product, and adding further improvements along the way has never been easier.