Microsoft Graph API
When the built-in Entra ID scraper doesn't cover your needs — custom $filter queries, specific user attributes, or endpoints not yet supported — you can use the HTTP scraper with Microsoft Graph API directly.
Prerequisites
- OAuth2 client credentials for your Entra ID app registration (tenant ID, client ID, client secret). You can supply these inline via oauth and env fields (shown below) or through a reusable Connection.
- The application permissions below, admin-consented:
| Permission | Purpose |
|---|---|
| User.Read.All | Read user profiles and attributes |
| Group.Read.All | Read group properties |
| GroupMember.Read.All | Read group membership |
| Directory.Read.All | Broad directory read access |
| Application.Read.All | Read app registrations and enterprise apps |
Create a new Azure App Registration
See the Entra ID integration guide for step-by-step Azure Portal instructions.
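To sanity-check the credentials before wiring them into a scraper, the OAuth2 client-credentials request can be sketched as follows. This is a minimal sketch, not part of Mission Control: it only builds the request against the Microsoft identity platform v2.0 token endpoint, and the function name `build_token_request` and the placeholder values are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) a client-credentials token request for Microsoft Graph."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # ".default" asks for the application permissions already consented for the app.
        "scope": "https://graph.microsoft.com/.default",
    }).encode("ascii")
    return Request(
        url,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder values (assumptions); substitute your own app registration details.
req = build_token_request("<tenant-id>", "<client-id>", "<client-secret>")
```

Passing `req` to `urllib.request.urlopen` should return a JSON body whose `access_token` field is the bearer token for Graph requests, provided the credentials and consent are in order.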
ScrapeConfig
The scraper below uses full: true so that each HTTP entry's CEL transform can emit reserved keys (external_groups, external_users, external_user_groups, as used in the example) that Mission Control processes as identity data rather than config items.
Both entries authenticate inline via OAuth2 client credentials. You can also reference a Connection instead of inlining the secrets.
ms-graph-scraper.yaml
apiVersion: configs.flanksource.com/v1
kind: ScrapeConfig
metadata:
  name: ms-graph-scraper
  namespace: mc
spec:
  schedule: "@every 1h"
  full: true
  http:
    # ── Groups with members ──
    - url: "https://graph.microsoft.com/v1.0/groups?$filter=startswith(displayName,'Flanksource')&$select=id,createdDateTime,displayName&$top=100&$expand=members($select=id,displayName,deletedDateTime,employeeId,mail,mailNickname,onPremisesDomainName,onPremisesSamAccountName,userPrincipalName)"
      pagination:
        nextPageExpr: '"@odata.nextLink" in response.body && response.body["@odata.nextLink"] != "" ? string(response.body["@odata.nextLink"]) : ""'
        maxPages: 5
      connection: connection://monitoring/azure-bearer
      id: $.id
      name: $.name
      type: Azure::Group
      transform:
        expr: |
          (dyn(config).map(group, {
            'config_type': 'Azure::Group',
            'id': group.id,
            'name': group.displayName,
            'config': {
              'id': group.id,
              'displayName': group.displayName,
              'createdDateTime': group.createdDateTime,
            },
            'external_groups': [{
              'name': group.displayName,
              'account_id': group.id,
              'group_type': 'security'
            }],
            'external_user_groups': has(group.members) ? group.members.filter(m, "@odata.type" in m && m["@odata.type"] == "#microsoft.graph.user" && m.?displayName.orValue('') != '').map(m, {
              'external_user_id': m.id,
              'external_group_id': group.id
            }) : []
          }) + [{
            'config_type': 'Azure::Group',
            'id': '__external_users__',
            'name': '__external_users__',
            'external_users':
              dyn(config).map(g, has(g.members) ? g.members.filter(m,
                "@odata.type" in m &&
                m["@odata.type"] == "#microsoft.graph.user" &&
                m.?displayName.orValue('') != ''
              ) : []).flatten()
              .map(m, m.id).uniq()
              .map(uid,
                dyn(config).map(g, has(g.members) ? g.members.filter(m,
                  "@odata.type" in m &&
                  m["@odata.type"] == "#microsoft.graph.user" &&
                  m.id == uid
                ) : []).flatten()[0]
              )
              .map(m, {
                'name': m.?displayName.orValue(''),
                'account_id': m.id,
                'email': m.?mail.orValue(''),
                'user_type': 'human',
                'aliases': [m.id, m.?mail.orValue(''), m.?userPrincipalName.orValue(''), m.?onPremisesSamAccountName.orValue(''), m.?mailNickname.orValue(''), m.?employeeId.orValue('')].filter(a, a != '').uniq()
              })
          }]).toJSON()
Adjust the $filter on the groups URL and $select on both entries to match the groups and attributes you need.
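The CEL expression above is dense, so a rough Python equivalent of its logic may help as a reading aid. It is a sketch only and Mission Control does not run it: it assumes `groups` is the list the transform iterates over as `config`, and the function names are hypothetical.

```python
USER_TYPE = "#microsoft.graph.user"
ALIAS_FIELDS = ("id", "mail", "userPrincipalName", "onPremisesSamAccountName",
                "mailNickname", "employeeId")


def is_named_user(member: dict) -> bool:
    """Keep only user-typed members that have a non-empty displayName."""
    return member.get("@odata.type") == USER_TYPE and bool(member.get("displayName"))


def aliases_for(user: dict) -> list[str]:
    """Collect the non-empty identifiers, de-duplicated in first-seen order."""
    values = [user.get(field) or "" for field in ALIAS_FIELDS]
    return list(dict.fromkeys(v for v in values if v))


def transform(groups: list[dict]) -> list[dict]:
    items = []
    users_by_id: dict[str, dict] = {}
    for group in groups:
        members = [m for m in group.get("members", []) if is_named_user(m)]
        for m in members:
            users_by_id.setdefault(m["id"], m)  # first occurrence of a user wins
        items.append({
            "config_type": "Azure::Group",
            "id": group["id"],
            "name": group["displayName"],
            "config": {
                "id": group["id"],
                "displayName": group["displayName"],
                "createdDateTime": group.get("createdDateTime"),
            },
            "external_groups": [{"name": group["displayName"],
                                 "account_id": group["id"],
                                 "group_type": "security"}],
            "external_user_groups": [{"external_user_id": m["id"],
                                      "external_group_id": group["id"]}
                                     for m in members],
        })
    # One extra sentinel item carries every distinct user exactly once.
    items.append({
        "config_type": "Azure::Group",
        "id": "__external_users__",
        "name": "__external_users__",
        "external_users": [{
            "name": u.get("displayName") or "",
            "account_id": u["id"],
            "email": u.get("mail") or "",
            "user_type": "human",
            "aliases": aliases_for(u),
        } for u in users_by_id.values()],
    })
    return items
```

The point to notice is the split: each group item carries its own memberships, while users are de-duplicated across all groups and emitted once on the `__external_users__` item.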
How Aliases Work
Every identity string added to the aliases array on a user is stored in the aliases column of the users table. When Mission Control processes an access record, it resolves the user identifier in that record by matching it against these aliases.
This means if a database access log references a user by onPremisesSamAccountName (e.g. jdoe) and the Entra scraper stored that same string as an alias, Mission Control links the access record to the correct user — no manual mapping needed.
Store every known login identifier as an alias: id, mail, userPrincipalName, onPremisesSamAccountName, mailNickname, employeeId.
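As an illustration of the matching described in this section, the sketch below builds an alias index and resolves identifiers from access records against it. It is a simplified model, not Mission Control's implementation: the in-memory index, the case-insensitive comparison, the function names, and the sample user are all assumptions.

```python
from typing import Optional


def build_alias_index(users: list[dict]) -> dict[str, str]:
    """Map each lower-cased alias to the account_id of the user that owns it."""
    index: dict[str, str] = {}
    for user in users:
        for alias in user["aliases"]:
            # Keep the first owner if two users ever share an alias.
            index.setdefault(alias.lower(), user["account_id"])
    return index


def resolve(index: dict[str, str], identifier: str) -> Optional[str]:
    """Return the owning account_id, or None when no alias matches."""
    return index.get(identifier.lower())


# Hypothetical user as a scraper might have stored it.
users = [{
    "account_id": "11111111-2222-3333-4444-555555555555",
    "aliases": [
        "11111111-2222-3333-4444-555555555555",
        "jdoe@contoso.com",
        "jdoe",
    ],
}]

index = build_alias_index(users)
print(resolve(index, "jdoe"))     # 11111111-2222-3333-4444-555555555555
print(resolve(index, "unknown"))  # None
```

A database log that only knows `jdoe` and a SaaS audit log that only knows `jdoe@contoso.com` both resolve to the same user, which is why storing every identifier pays off.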
Common User Attributes
| Attribute | Description | Useful As Alias? |
|---|---|---|
| id | Azure AD object ID (GUID) | Yes: stable, unique identifier |
| mail | Primary email address | Yes: commonly used in access logs |
| userPrincipalName | UPN (e.g. user@contoso.com) | Yes: default sign-in identifier |
| onPremisesSamAccountName | On-prem AD sAMAccountName (e.g. jdoe) | Yes: matches database and VPN logs |
| employeeId | HR system employee ID | Yes: matches HR and ITSM systems |
| mailNickname | Exchange alias (e.g. jdoe) | Yes: matches legacy mail systems |
| department | User's department | No: useful for filtering, not identification |
| jobTitle | User's job title | No: useful for reporting |
Custom $filter Examples
The MS Graph API supports OData $filter queries to narrow results. Use these with the url parameter in the HTTP scraper.
Filter groups by display name prefix:
https://graph.microsoft.com/v1.0/groups?$filter=startsWith(displayName,'app-')&$select=id,displayName,description
Filter users by department:
https://graph.microsoft.com/v1.0/users?$filter=department eq 'Engineering'&$select=id,userPrincipalName,mail,department
List only enabled users:
https://graph.microsoft.com/v1.0/users?$filter=accountEnabled eq true&$select=id,userPrincipalName,mail,accountEnabled
Some advanced $filter queries (e.g. using endsWith, NOT, or property paths not indexed by default) require the ConsistencyLevel: eventual HTTP header and the $count=true query parameter. See the Microsoft Graph advanced query documentation for details.
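Filter expressions contain spaces and quotes, so it can help to build the URL programmatically before pasting it into the url field. The sketch below uses only the Python standard library; the helper name `graph_url` is hypothetical, and the `endswith` query is just one example of an advanced query that needs the extra header and `$count=true`.

```python
from urllib.parse import quote, urlencode

GRAPH = "https://graph.microsoft.com/v1.0"


def graph_url(resource: str, **params: str) -> str:
    """Build a Graph URL, prefixing each OData option with '$' and percent-encoding values."""
    query = urlencode(
        {f"${name}": value for name, value in params.items()},
        quote_via=quote,   # encode spaces as %20 rather than '+'
        safe="$,'()",      # keep OData punctuation readable
    )
    return f"{GRAPH}/{resource}?{query}"


# Basic filter, equivalent to the department example above.
basic_url = graph_url(
    "users",
    filter="department eq 'Engineering'",
    select="id,userPrincipalName,mail,department",
)
print(basic_url)

# Advanced query: endswith needs $count=true plus the ConsistencyLevel header.
advanced_url = graph_url(
    "users",
    filter="endswith(mail,'@contoso.com')",
    count="true",
    select="id,mail",
)
advanced_headers = {"ConsistencyLevel": "eventual"}
```

How the header is attached depends on the client making the request; `advanced_headers` here is only a plain dictionary showing the name and value Graph expects.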
Pagination
On its own, the HTTP scraper does not follow @odata.nextLink pagination tokens. Unless you configure a pagination block with a nextPageExpr, as in the ScrapeConfig above, a single request returns at most the number of items specified by $top (maximum 999 for most MS Graph endpoints).
For collections with fewer than 999 items, set $top=999 to retrieve everything in one request.
For larger collections:
- Partition via $filter: split requests by department, domain, or another attribute to keep each response under the limit
- Use the built-in Entra scraper: the Azure scraper handles pagination automatically and is recommended for full-tenant scrapes
If you need all users or groups regardless of count, use the built-in Entra ID scraper instead of the HTTP scraper. It handles pagination, rate limiting, and incremental updates automatically.
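For context, the @odata.nextLink mechanism itself can be sketched in a few lines. This illustrates the protocol rather than how either scraper is implemented: `fetch` is a hypothetical callable that returns the decoded JSON body for a URL, and `max_pages` plays the same role as the maxPages cap in the ScrapeConfig above.

```python
from typing import Callable, Iterator


def iter_items(first_url: str, fetch: Callable[[str], dict], max_pages: int = 5) -> Iterator[dict]:
    """Yield items from each page's "value" array, following @odata.nextLink."""
    url = first_url
    for _ in range(max_pages):
        body = fetch(url)
        yield from body.get("value", [])
        # A missing or empty nextLink means this was the last page.
        url = body.get("@odata.nextLink") or ""
        if not url:
            return


# Stand-in for real HTTP calls: two canned pages keyed by URL.
pages = {
    "page1": {"value": [{"id": "a"}, {"id": "b"}], "@odata.nextLink": "page2"},
    "page2": {"value": [{"id": "c"}]},
}
ids = [item["id"] for item in iter_items("page1", pages.__getitem__)]
print(ids)  # ['a', 'b', 'c']
```

Note that a page cap silently truncates results once it is reached, so a cap of 5 pages with $top=100 yields at most 500 items.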