Office 365 Management API Connector for ELK

I’ve been in the Info Sec sector for about a year and a half. While I’ve learned quite a lot in that time, I still know nothing, Jon Snow, and I feel like I am stumbling my way through everything, blindfolded. So I have decided to start publicly documenting some of the more useful things I am teaching myself, because I know there are other folks out there in the dark too!

If you’re like me, you’ve used the Office 365 Audit Log search and maybe you’ve even done queries via PowerShell. But if you want more proactive data rather than reactive, you’ll need to script something. Originally I had a scheduled script running to pull user login locations to a .csv file, but I figured I could get more bang for my buck by building an API connector and shipping all Azure Active Directory audit logs to my Elastic stack.

The fruits of my labor. A stupid little map with dots. Just kidding. I love it.

Great idea, self. But… how? My only previous experience with building API connectors was an Akamai Open API connector that was heavily based on existing code. I did very little heavy lifting. So what you’ll find here is the result of a crap ton of googling, trial and error, and cursing. Let’s dive in.

First, you need to ensure Audit logs are enabled (if you can search the Audit Logs in the Security and Compliance center, you’re good to go).

Create an Azure App

This app will be the connection between whatever you use to make the API calls (in my case, a Python script) and the Office 365 Management API itself.

Log in to https://portal.azure.com

Azure AD > App Registrations > New Application Registration

Give the app a name, choose Web/API, and enter any dummy URL for the Sign On URL since we will not be signing in to this app directly. Take note of your application’s ID. You will need this later. After saving your app, you will need to generate a key for API access. Click Settings > Keys.

Enter a description for the key and choose your desired duration. Click Save and your key will be shown to you. BE SURE TO COPY THIS because you will not be able to view the key once you leave the Key management blade.

Note: I did not use this key in my script, since I utilized a different method of authentication (this is covered in the next section). But it’s a good idea to go ahead and create a key in case you need one. Be sure to delete it later if you don’t need it.

Next you (or your tenant administrator) will need to give your app access to the Office 365 Management API. While still in the Settings blade, visit Required Permissions (under the API access section), click Add and choose Office 365 Management APIs. You will then be asked to choose permissions. Under Application Permissions, choose whichever data you want to be able to access via the APIs and Save.

According to Microsoft, only the first three permissions are in use and the last four will be removed at some point.

If you have tenant admin rights, in the Required Permissions blade, select Office 365 Management APIs and then Grant Access. If not, you will need to have a tenant admin grant your app the permissions you just selected.

Create a Self-Signed Certificate and Add Data to Your App’s Manifest

In order to have near real-time access to the audit data, you’ll want an application that constantly polls the API. You’ll need to generate a self-signed certificate so that your app can request app-only access tokens without repeatedly requesting consent from the tenant admin after initial consent is granted. This is called Client Credentials Grant Flow.

There are many ways to generate certificates and extract the data you need for the manifest. Since I am a Linux user, I utilized OpenSSL and Python, but feel free to use whatever tool(s) you want (the Microsoft documentation details the process using PowerShell).

Generate the certificates

$ openssl req -x509 -sha256 -nodes -days 730 -newkey rsa:2048 -keyout yourPrivate.pem -out yourPublic.cer

Compute the base64 thumbprint of the public certificate

$ openssl x509 -outform der -in yourPublic.cer | openssl dgst -binary -sha1 | openssl base64

Compute the base64 string of the public certificate contents

$ openssl x509 -outform der -in yourPublic.cer | openssl base64 -A

Generate a random UUID (also called a GUID on Windows)

$ python -c "import uuid; print(uuid.uuid4())"
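If you’d rather stay in Python for the whole manifest prep, the same three values can be computed with just the standard library. This is a sketch under the assumption that you’ve already converted the PEM certificate to DER bytes (e.g. with the `openssl x509 -outform der` step above); the function name is mine:

```python
import base64
import hashlib
import uuid

def manifest_values(der_bytes):
    """Compute the manifest fields from the DER-encoded public certificate:
    customKeyIdentifier (base64 SHA-1 thumbprint) and value (base64 cert)."""
    thumbprint = base64.b64encode(hashlib.sha1(der_bytes).digest()).decode()
    cert_b64 = base64.b64encode(der_bytes).decode()
    return thumbprint, cert_b64

# der_bytes = open("yourPublic.der", "rb").read()  # from the openssl conversion
# thumbprint, cert_b64 = manifest_values(der_bytes)
key_id = str(uuid.uuid4())  # the random keyId for the manifest
```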

Next, you’ll need to add this data to your app’s Manifest.

"keyCredentials": [
{
"customKeyIdentifier" : "BASE-64-THUMBPRINT",
"keyId": "UUID",
"type": "AsymmetricX509Cert",
"usage": "Verify",
"value": "BASE-64-STRING"
}
],

I didn’t trust myself to not bungle this up, so I downloaded the manifest, edited it locally, and uploaded it, rather than edit it right in the window.

Scripting the API Connector

Most of the documentation I found while developing this script was focused on PowerShell. As a full time Linux/OS X user, my PowerShell abilities are… lacking. As such, everything from this point will be pretty much Python 3 specific, though if you have found yourself totally lost, it could still be helpful even if you are using a different scripting language. I refer to various mini scripts I used for testing; they are linked throughout and can be found, along with the production-ready version of the script, in my Office 365 Management API Connector repo on GitHub.

Requesting Access Tokens

After your Azure Web App has been granted initial access (when you or someone with tenant admin rights granted access), you will need to request a new access token for any API requests you make. Luckily Microsoft has an authentication library for Python called Azure Active Directory Authentication Library (ADAL). If you don’t have it already, install it with:

pip3 install adal

I wanted to first make sure I could get the access token before moving on to making API requests. Use this simple script to make a token request and view the response. Note: be sure your private key is in the same directory as this script, or include the full path to it.

import adal, json

clientid = "YOUR-APPLICATION-ID"
tenant_id = "YOUR-TENANT-ID"
resource = "https://manage.office.com"
private_key = open("YOUR-PRIVATE-KEY-FILENAME.pem", "r").read()
public_key_thumbprint = "YOUR-PUBLIC-KEY-THUMBPRINT"

context = adal.AuthenticationContext(
    'https://login.microsoftonline.com/{}'.format(tenant_id))
token = context.acquire_token_with_client_certificate(
    resource,
    clientid,
    private_key,
    public_key_thumbprint)

print('Here is the token:')
print(json.dumps(token, indent=2))

(Find your Tenant ID; the public key thumbprint can be found under Keys in your Azure app’s settings.)

Subscribe to feed

Using a simple script we can subscribe our tenant to whichever API content type feed we are interested in. For now, I am only subscribing to the Azure Active Directory content type. If you are subscribing to more than one content type, you must subscribe to each separately.

Subscription response

{'webhook': None, 'contentType': 'Audit.AzureActiveDirectory', 'status': 'enabled'}

Note: I am not using webhooks as I will be polling the API every five minutes and sending data to Logstash via TCP.
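For reference, the subscription call behind that response boils down to a POST against the subscriptions/start endpoint. Here is a stdlib-only sketch (the endpoint path is from the Management Activity API; the function and variable names are mine):

```python
import json
import urllib.parse
import urllib.request

def feed_url(tenant_id, endpoint):
    """Build a Management Activity API feed URL for this tenant."""
    return "https://manage.office.com/api/v1.0/{}/activity/feed/{}".format(
        tenant_id, endpoint)

def start_subscription(tenant_id, access_token,
                       content_type="Audit.AzureActiveDirectory"):
    """Subscribe the tenant to one content type (call once per content type)."""
    url = feed_url(tenant_id, "subscriptions/start")
    url += "?" + urllib.parse.urlencode({"contentType": content_type})
    req = urllib.request.Request(
        url, method="POST",
        headers={"Authorization": "Bearer " + access_token})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode())
```

The access token here is the `accessToken` value from the ADAL response in the previous section.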

Testing

Since I’m still learning Python, I like to do lots of tests as I’m writing code to make sure I don’t get too far in and find myself with a problem I can’t pinpoint. I’ve published a few of the testing scripts in the repo and I’ll quickly go over my process here. But if you are just interested in the full script, skip ahead to Prepping the Script for Production, or head straight over to the GitHub repo.

First, I tested that I was able to properly pull events.

I used the start and end time parameters to limit the number of responses; I requested one minute of content during off-peak hours. (Times are in UTC.)
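Pulling events is a two-step affair: list the available content blobs for the window, then fetch each blob’s contentUri. A hedged, stdlib-only sketch (the URL shape is from the Management Activity API; the helper names are mine):

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://manage.office.com/api/v1.0"

def content_url(tenant_id, content_type, start_time, end_time):
    """Build the list-content URL; times are UTC strings like '2019-01-01T00:00:00'."""
    query = urllib.parse.urlencode({
        "contentType": content_type,
        "startTime": start_time,
        "endTime": end_time,
    })
    return "{}/{}/activity/feed/subscriptions/content?{}".format(
        API_ROOT, tenant_id, query)

def fetch_events(tenant_id, token, start_time, end_time,
                 content_type="Audit.AzureActiveDirectory"):
    """List the content blobs for the window, then GET each blob's events."""
    headers = {"Authorization": "Bearer " + token}
    req = urllib.request.Request(
        content_url(tenant_id, content_type, start_time, end_time),
        headers=headers)
    with urllib.request.urlopen(req) as resp:
        blobs = json.loads(resp.read().decode())
    events = []
    for blob in blobs:
        req = urllib.request.Request(blob["contentUri"], headers=headers)
        with urllib.request.urlopen(req) as resp:
            events.extend(json.loads(resp.read().decode()))
    return events
```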

After ensuring that I was able to pull events, I then tested sending these same events to my Logstash server:

  • test_tcp.py
  • o365_api.conf (don’t forget to add a config file to Logstash to enable listening on the port you specify and to enable shipping your logs to Elasticsearch)
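The shipping side of that test can be sketched like this: one JSON object per line over a plain TCP socket, which pairs with a Logstash tcp input using the json_lines codec. The host and port here are placeholders:

```python
import json
import socket

def ship_to_logstash(events, host="localhost", port=5000):
    """Send each event as one JSON line over TCP to Logstash."""
    payload = "".join(json.dumps(event) + "\n" for event in events)
    with socket.create_connection((host, port)) as sock:
        sock.sendall(payload.encode("utf-8"))
```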

The last thing I tested was pulling the NextPageUri, which required adding a while loop to the script. I also created a function for the bulk of the processing to keep the script sort of clean.
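When a content listing is too large for one response, the API returns a NextPageUri response header pointing at the next page. The loop separates cleanly from the HTTP call, which also makes it easy to test; this split into a pure loop plus a fetcher is my own structuring, not the repo’s:

```python
import json
import urllib.request

def list_all_blobs(first_url, fetch_page):
    """fetch_page(url) -> (blobs, next_page_uri or None); loop until exhausted."""
    blobs, url = [], first_url
    while url:
        page, url = fetch_page(url)
        blobs.extend(page)
    return blobs

def http_fetch_page(token):
    """Real fetcher: GET the url and read the NextPageUri response header."""
    def fetch(url):
        req = urllib.request.Request(
            url, headers={"Authorization": "Bearer " + token})
        with urllib.request.urlopen(req) as resp:
            return (json.loads(resp.read().decode()),
                    resp.headers.get("NextPageUri"))  # absent on the last page
    return fetch
```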

Prepping the Script for Production

Logging and Errors

In the production version of the script I am using Python’s super simple logging module to catch and log the response for any non-200 status codes for my API calls. You will notice in the script that I also print this information to stdout; I am using Rundeck to schedule my script, and I find it useful to see the details of any errors for each run of the script.
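The non-200 handling amounts to something like the sketch below; the log filename, message format, and function name are placeholders of mine, not the repo’s exact code:

```python
import logging

logging.basicConfig(filename="o365_api.log", level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")

def check_response(status_code, body):
    """Log and print details of any non-200 API response; True means OK."""
    if status_code == 200:
        return True
    msg = "API call failed: {} {}".format(status_code, body)
    logging.error(msg)  # goes to the log file
    print(msg)          # surfaces in Rundeck's per-run stdout
    return False
```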

Rundeck Job Activity: I can easily view failed runs of my script and view the errors printed to stdout.

I also added a bit of logic to account for and log anytime a null access token is received. In the near future I want to perhaps add a function that will send me a simple email if a null access token is received since this could signal an issue with my credentials and I would probably want to get that sorted out ASAP.

Overlapping Queries

I added datetime values for my time variables and I have my queries overlapping by roughly 5 minutes (polling the API every 5 minutes for 10 minutes’ worth of available content). This is to account for “API drift”, since the events are not necessarily sent immediately, nor are they sent in strict sequential order. I will likely be refining this script through trial and error and could increase the overlap if necessary.
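The overlapping window is just simple datetime arithmetic; the function name is mine, and the timestamp format matches the UTC strings the API expects:

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%dT%H:%M:%S"  # UTC timestamps in the shape the API expects

def query_window(minutes=10):
    """Return (startTime, endTime) covering the last `minutes` minutes in UTC.
    Polling every 5 minutes with a 10-minute window overlaps runs by ~5 minutes."""
    end = datetime.utcnow()
    start = end - timedelta(minutes=minutes)
    return start.strftime(FMT), end.strftime(FMT)
```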

To account for any duplicate events that appear due to the overlapping queries, I considered using a file to hold the event ids from each run of the script, and using logic to drop any events that appeared in the previous run of the script. However, I decided that Logstash is the better place to handle this — I instead set the “id” field to be used as the document id in Elasticsearch to prevent any duplication.
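In Logstash, that deduplication is a one-line change in the elasticsearch output. A sketch of the idea (the hosts and index pattern are placeholders, and the field reference must match whatever your events expose as the unique id):

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "log-o365-%{+YYYY.MM.dd}"
    document_id => "%{id}"  # reusing the event id prevents duplicate documents
  }
}
```

With the event id as the document id, a re-ingested duplicate simply overwrites the existing document instead of creating a new one.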

Elasticsearch Index Mapping

Depending on how diligent your company is about Elasticsearch indexing, you may want to define the mapping for the fields in your events. Generally, I utilize the default mapping because I’m lazy and my data is usually simple. However, since the original reason I was even interested in this API is user login locations (and the sweet, sweet maps I can create), I needed to ensure that the GeoIP information was parsed correctly. Luckily, I had previously edited our default index template.

PUT /_template/log
{
  "index_patterns": ["log*"],
  "mappings": {
    "doc": {
      "properties": {
        "geoip": {
          "dynamic": true,
          "properties": {
            "ip": { "type": "ip" },
            "location": { "type": "geo_point" },
            "latitude": { "type": "half_float" },
            "longitude": { "type": "half_float" }
          }
        }
      }
    }
  }
}

I did eventually create a full mapping for the Azure Active Directory content type. If you plan on polling other content types like Exchange or SharePoint, you’ll need to modify the mappings some.

As I mentioned earlier, I am using Rundeck (and a Docker container) to run my script in production. I won’t go into any detail about that here since I am starting to go cross-eyed from writing this, but feel free to email me if you want to know more about how I’m using Rundeck.

Improving the API Connector

As time goes on, I would like to refine the script’s logging and error handling, and I will likely adjust the interval the script runs at and/or the window of time of my queries to ensure I am getting all of the audit events. I’ll be sure to update the GitHub repo if I ever get to that. I will probably never get to that.

And since we’re talking about things I’ll probably never do, I think a cool extension of this script would be exploring Microsoft Graph to use the “Application ID” field to lookup the display name of the application that is requesting the user’s login.

And that’s it. For now. Be sure to check out the GitHub repo! I’m tired.

InfoSec practitioner. I don't know anything but I'm trying!