Microsoft Graph API

When the built-in Entra ID scraper doesn't cover your needs — custom $filter queries, specific user attributes, or endpoints not yet supported — you can use the HTTP scraper with Microsoft Graph API directly.

Prerequisites

  • OAuth2 client credentials for your Entra ID app registration (tenant ID, client ID, client secret). Supply these either inline via the oauth and env fields or through a reusable Connection.
  • The application permissions listed below, granted with admin consent

| Permission | Purpose |
|---|---|
| User.Read.All | Read user profiles and attributes |
| Group.Read.All | Read group properties |
| GroupMember.Read.All | Read group membership |
| Directory.Read.All | Broad directory read access |
| Application.Read.All | Read app registrations and enterprise apps |

Create a new Azure App Registration

See the Entra ID integration guide for step-by-step Azure Portal instructions.
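
For orientation, the tenant ID, client ID, and client secret from the prerequisites are what an OAuth2 client-credentials token request to the Microsoft identity platform is built from. The sketch below only constructs that request and does not send it. The function name build_token_request and the placeholder values are made up for this example, and how Mission Control obtains the token internally is not shown here.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) a client-credentials token request for Microsoft Graph."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # ".default" requests the application permissions already consented for the app
        "scope": "https://graph.microsoft.com/.default",
    }).encode("ascii")
    return Request(
        url,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder values, not real credentials
req = build_token_request("my-tenant-id", "my-client-id", "my-secret")
print(req.get_method(), req.full_url)
```

The JSON response to such a request carries an access_token, which is then sent to Graph as a Bearer token in the Authorization header.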

ScrapeConfig

The scraper below sets full: true so that the CEL transform on each HTTP entry can emit reserved keys (external_groups, external_users, external_user_groups) that Mission Control processes as identity data rather than config items.

The entry shown here authenticates through a Connection (connection://monitoring/azure-bearer). You can instead inline the OAuth2 client credentials on the entry.

ms-graph-scraper.yaml
apiVersion: configs.flanksource.com/v1
kind: ScrapeConfig
metadata:
  name: ms-graph-scraper
  namespace: mc
spec:
  schedule: "@every 1h"
  full: true
  http:
    # ── Groups with members ──
    - url: "https://graph.microsoft.com/v1.0/groups?$filter=startswith(displayName,'Flanksource')&$select=id,createdDateTime,displayName&$top=100&$expand=members($select=id,displayName,deletedDateTime,employeeId,mail,mailNickname,onPremisesDomainName,onPremisesSamAccountName,userPrincipalName)"
      pagination:
        nextPageExpr: '"@odata.nextLink" in response.body && response.body["@odata.nextLink"] != "" ? string(response.body["@odata.nextLink"]) : ""'
        maxPages: 5
      connection: connection://monitoring/azure-bearer
      id: $.id
      name: $.name
      type: Azure::Group
      transform:
        expr: |
          (dyn(config).map(group, {
            'config_type': 'Azure::Group',
            'id': group.id,
            'name': group.displayName,
            'config': {
              'id': group.id,
              'displayName': group.displayName,
              'createdDateTime': group.createdDateTime,
            },
            'external_groups': [{
              'name': group.displayName,
              'account_id': group.id,
              'group_type': 'security'
            }],
            'external_user_groups': has(group.members) ? group.members.filter(m, "@odata.type" in m && m["@odata.type"] == "#microsoft.graph.user" && m.?displayName.orValue('') != '').map(m, {
              'external_user_id': m.id,
              'external_group_id': group.id
            }) : []
          }) + [{
            'config_type': 'Azure::Group',
            'id': '__external_users__',
            'name': '__external_users__',
            'external_users':
              dyn(config).map(g, has(g.members) ? g.members.filter(m,
                "@odata.type" in m &&
                m["@odata.type"] == "#microsoft.graph.user" &&
                m.?displayName.orValue('') != ''
              ) : []).flatten()
              .map(m, m.id).uniq()
              .map(uid,
                dyn(config).map(g, has(g.members) ? g.members.filter(m,
                  "@odata.type" in m &&
                  m["@odata.type"] == "#microsoft.graph.user" &&
                  m.id == uid
                ) : []).flatten()[0]
              )
              .map(m, {
                'name': m.?displayName.orValue(''),
                'account_id': m.id,
                'email': m.?mail.orValue(''),
                'user_type': 'human',
                'aliases': [m.id, m.?mail.orValue(''), m.?userPrincipalName.orValue(''), m.?onPremisesSamAccountName.orValue(''), m.?mailNickname.orValue(''), m.?employeeId.orValue('')].filter(a, a != '').uniq()
              })
          }]).toJSON()

Adjust the $filter and $select on the groups URL, including the nested $select inside $expand=members, to match the groups and attributes you need.
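
To make the shape of the transform output easier to see, here is a rough Python sketch of what the CEL expression in the ScrapeConfig computes: one item per group, plus a synthetic __external_users__ item holding each user once. It is an illustration only, not how Mission Control evaluates the transform. The sample groups and IDs and the helper names (is_named_user, to_external_user, transform) are made up for this example, and alias order here follows Python insertion order, which the CEL uniq() call may not guarantee.

```python
ALIAS_FIELDS = ("id", "mail", "userPrincipalName", "onPremisesSamAccountName",
                "mailNickname", "employeeId")


def is_named_user(member: dict) -> bool:
    """Keep only user objects that have a non-empty displayName."""
    return (member.get("@odata.type") == "#microsoft.graph.user"
            and bool(member.get("displayName")))


def to_external_user(member: dict) -> dict:
    """Collect every non-empty identifier as an alias, without duplicates."""
    aliases = []
    for field in ALIAS_FIELDS:
        value = member.get(field) or ""
        if value and value not in aliases:
            aliases.append(value)
    return {
        "name": member.get("displayName") or "",
        "account_id": member["id"],
        "email": member.get("mail") or "",
        "user_type": "human",
        "aliases": aliases,
    }


def transform(groups: list) -> list:
    results = []
    users_by_id = {}  # first occurrence of each user wins, as with [0] in the CEL
    for group in groups:
        members = [m for m in group.get("members", []) if is_named_user(m)]
        for m in members:
            users_by_id.setdefault(m["id"], m)
        results.append({
            "config_type": "Azure::Group",
            "id": group["id"],
            "name": group["displayName"],
            "config": {k: group.get(k) for k in ("id", "displayName", "createdDateTime")},
            "external_groups": [{"name": group["displayName"],
                                 "account_id": group["id"],
                                 "group_type": "security"}],
            "external_user_groups": [{"external_user_id": m["id"],
                                      "external_group_id": group["id"]} for m in members],
        })
    # Synthetic item that carries the de-duplicated user list
    results.append({
        "config_type": "Azure::Group",
        "id": "__external_users__",
        "name": "__external_users__",
        "external_users": [to_external_user(m) for m in users_by_id.values()],
    })
    return results


# Hypothetical sample data: one user in two groups, plus a non-user member
sample = [
    {"id": "g1", "displayName": "Flanksource Admins", "createdDateTime": "2024-01-01T00:00:00Z",
     "members": [
         {"@odata.type": "#microsoft.graph.user", "id": "u1", "displayName": "Jane Doe",
          "mail": "jane@example.com", "userPrincipalName": "jane@example.com",
          "onPremisesSamAccountName": "jdoe", "mailNickname": "jdoe"},
         {"@odata.type": "#microsoft.graph.servicePrincipal", "id": "sp1", "displayName": "ci-bot"},
     ]},
    {"id": "g2", "displayName": "Flanksource Devs", "createdDateTime": "2024-02-01T00:00:00Z",
     "members": [{"@odata.type": "#microsoft.graph.user", "id": "u1", "displayName": "Jane Doe"}]},
]
results = transform(sample)
print(results[-1]["external_users"])
```

In this sample the service principal is dropped, the user appears in both groups' external_user_groups, and the final item lists the user once.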

How Aliases Work

Every identity string added to the aliases array on a user is stored in the aliases column of the users table. When Mission Control processes an access record, it resolves the user identifier in that record by matching it against these aliases.

This means if a database access log references a user by onPremisesSamAccountName (e.g. jdoe) and the Entra scraper stored that same string as an alias, Mission Control links the access record to the correct user — no manual mapping needed.

Store every known login identifier as an alias: id, mail, userPrincipalName, onPremisesSamAccountName, mailNickname, employeeId.
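
The matching described above can be pictured as a lookup from alias string to user. The sketch below is a simplified model, assuming exact string comparison; Mission Control's actual matching rules (for example, case handling) are not specified here, and the names build_alias_index, users, and access_log are hypothetical.

```python
def build_alias_index(users: list) -> dict:
    """Map every alias string to the account_id of the user that owns it."""
    index = {}
    for user in users:
        for alias in user["aliases"]:
            # If two users share an alias, the first one seen keeps it
            index.setdefault(alias, user["account_id"])
    return index


# Hypothetical users as they might look after scraping
users = [
    {"account_id": "u1", "name": "Jane Doe", "aliases": ["u1", "jane@example.com", "jdoe"]},
    {"account_id": "u2", "name": "Sam Roe", "aliases": ["u2", "sam@example.com", "sroe"]},
]
index = build_alias_index(users)

# Hypothetical access records that identify users by different strings
access_log = [
    {"user": "jdoe", "action": "SELECT"},
    {"user": "ghost", "action": "LOGIN"},
]
resolved = [(entry["user"], index.get(entry["user"])) for entry in access_log]
print(resolved)
```

An identifier that was never stored as an alias (ghost in this sample) resolves to nothing, which is why storing every known login identifier matters.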

Common User Attributes

| Attribute | Description | Useful As Alias? |
|---|---|---|
| id | Azure AD object ID (GUID) | Yes: stable, unique identifier |
| mail | Primary email address | Yes: commonly used in access logs |
| userPrincipalName | UPN (e.g. user@contoso.com) | Yes: default sign-in identifier |
| onPremisesSamAccountName | On-prem AD sAMAccountName (e.g. jdoe) | Yes: matches database and VPN logs |
| employeeId | HR system employee ID | Yes: matches HR and ITSM systems |
| mailNickname | Exchange alias (e.g. jdoe) | Yes: matches legacy mail systems |
| department | User's department | No: useful for filtering, not identification |
| jobTitle | User's job title | No: useful for reporting |

Custom $filter Examples

The MS Graph API supports OData $filter queries to narrow results. Use these with the url parameter in the HTTP scraper.

Filter groups by display name prefix:

https://graph.microsoft.com/v1.0/groups?$filter=startsWith(displayName,'app-')&$select=id,displayName,description

Filter users by department:

https://graph.microsoft.com/v1.0/users?$filter=department eq 'Engineering'&$select=id,userPrincipalName,mail,department

List only enabled users:

https://graph.microsoft.com/v1.0/users?$filter=accountEnabled eq true&$select=id,userPrincipalName,mail,accountEnabled

Info: Some advanced $filter queries (e.g. using endsWith, NOT, or property paths not indexed by default) require the ConsistencyLevel: eventual HTTP header and the $count=true query parameter. See the Microsoft Graph advanced query documentation for details.
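
As a rough illustration of the request shapes involved, the Python sketch below builds (without sending) a basic filtered request and an advanced one that adds $count=true and the ConsistencyLevel: eventual header. The helper name build_query and its advanced flag are invented for this example, and how to attach custom headers to an HTTP scraper entry is not covered here.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

GRAPH = "https://graph.microsoft.com/v1.0"


def build_query(resource: str, odata_filter: str, select: list, advanced: bool = False) -> Request:
    """Build (but do not send) a Graph GET request with an encoded $filter."""
    params = {"$filter": odata_filter, "$select": ",".join(select)}
    headers = {}
    if advanced:
        # Advanced queries need both the query parameter and the header
        params["$count"] = "true"
        headers["ConsistencyLevel"] = "eventual"
    # Keep OData punctuation readable; spaces and '@' are percent-encoded
    query = urlencode(params, quote_via=quote, safe="$,()'")
    return Request(f"{GRAPH}/{resource}?{query}", headers=headers)


basic = build_query("users", "department eq 'Engineering'", ["id", "mail"])
advanced = build_query("users", "endswith(mail,'@contoso.com')", ["id", "mail"], advanced=True)
print(basic.full_url)
print(advanced.full_url)
```

Note that urllib normalizes the stored header name to Consistencylevel; HTTP header names are case-insensitive, so the request is equivalent.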

Pagination

The HTTP scraper does not follow @odata.nextLink pagination tokens on its own. Unless an entry defines a pagination block (such as the nextPageExpr and maxPages settings in the ScrapeConfig above), a single request returns at most the number of items specified by $top (maximum 999 for most MS Graph endpoints).

For collections with at most 999 items, set $top=999 to retrieve everything in one request.

For larger collections:

  • Partition via $filter — split requests by department, domain, or another attribute to keep each response under the limit
  • Use the built-in Entra scraper — the Azure scraper handles pagination automatically and is recommended for full-tenant scrapes

Tip: If you need all users or groups regardless of count, use the built-in Entra ID scraper instead of the HTTP scraper. It handles pagination, rate limiting, and incremental updates automatically.
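
For readers who want to see what following @odata.nextLink involves, here is a minimal Python sketch of the loop, with a page cap comparable in spirit to the maxPages setting in the ScrapeConfig above. The function iter_pages and the injected fetch callable are hypothetical; the fake pages stand in for real Graph responses so the example runs without network access.

```python
from typing import Callable, Iterator


def iter_pages(first_url: str, fetch: Callable[[str], dict], max_pages: int = 5) -> Iterator:
    """Yield items from each page, following @odata.nextLink up to max_pages requests."""
    url = first_url
    for _ in range(max_pages):
        body = fetch(url)
        yield from body.get("value", [])
        # A missing or empty nextLink means this was the last page
        url = body.get("@odata.nextLink") or ""
        if not url:
            return


# Fake responses keyed by URL, standing in for real Graph pages
fake_pages = {
    "page1": {"value": [1, 2], "@odata.nextLink": "page2"},
    "page2": {"value": [3], "@odata.nextLink": "page3"},
    "page3": {"value": [4]},
}

all_items = list(iter_pages("page1", fake_pages.__getitem__))
capped_items = list(iter_pages("page1", fake_pages.__getitem__, max_pages=2))
print(all_items, capped_items)
```

With a cap in place, anything beyond the last permitted page is silently left out, so the cap should be sized against the expected collection size.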
