Microsoft Graph API

When the built-in Entra ID scraper doesn't cover your needs — custom $filter queries, specific user attributes, or endpoints not yet supported — you can use the HTTP scraper with Microsoft Graph API directly.

Prerequisites

OAuth2 client credentials for your Entra ID app registration (tenant ID, client ID, client secret). You can supply these inline via the oauth and env fields or through a reusable Connection, as the ScrapeConfig below does.
The following application permissions, granted with admin consent:

Permission	Purpose
`User.Read.All`	Read user profiles and attributes
`Group.Read.All`	Read group properties
`GroupMember.Read.All`	Read group membership
`Directory.Read.All`	Broad directory read access
`Application.Read.All`	Read app registrations and enterprise apps
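
For reference, the client-credentials exchange that these prerequisites enable can be sketched in Python. The snippet only builds the token request against the Microsoft identity platform v2.0 endpoint and does not send it. The tenant ID, client ID, and secret are placeholders, and the helper name build_token_request is hypothetical. It is illustrative only and is not something you need to run alongside the scraper.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) an OAuth2 client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # ".default" asks for the application permissions already consented to.
        "scope": "https://graph.microsoft.com/.default",
    })
    return Request(
        url,
        data=form.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values (assumptions); substitute your app registration's details.
req = build_token_request("my-tenant-id", "my-client-id", "my-secret")
```

The JSON response to this request carries an access_token, which is then sent to Graph as an Authorization: Bearer header.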

Create a new Azure App Registration

See the Entra ID integration guide for step-by-step Azure Portal instructions.

ScrapeConfig

The scraper below sets full: true so that the HTTP entry's CEL transform can emit reserved keys (external_groups, external_users, external_user_groups) that Mission Control processes as identity data rather than as config items.

The entry authenticates through a Connection (connection://monitoring/azure-bearer). You can instead inline the OAuth2 client credentials on the entry.

ms-graph-scraper.yaml
apiVersion: configs.flanksource.com/v1
kind: ScrapeConfig
metadata:
  name: ms-graph-scraper
  namespace: mc
spec:
  schedule: "@every 1h"
  full: true
  http:
    # ── Groups with members ──
    - url: "https://graph.microsoft.com/v1.0/groups?$filter=startswith(displayName,'Flanksource')&$select=id,createdDateTime,displayName&$top=100&$expand=members($select=id,displayName,deletedDateTime,employeeId,mail,mailNickname,onPremisesDomainName,onPremisesSamAccountName,userPrincipalName)"
      pagination:
        nextPageExpr: '"@odata.nextLink" in response.body && response.body["@odata.nextLink"] != "" ? string(response.body["@odata.nextLink"]) : ""'
        maxPages: 5
      connection: connection://monitoring/azure-bearer
      id: $.id
      name: $.name
      type: Azure::Group
      transform:
        expr: |
          (dyn(config).map(group, {
            'config_type': 'Azure::Group',
            'id': group.id,
            'name': group.displayName,
            'config': {
              'id': group.id,
              'displayName': group.displayName,
              'createdDateTime': group.createdDateTime,
            },
            'external_groups': [{
              'name': group.displayName,
              'account_id': group.id,
              'group_type': 'security'
            }],
            'external_user_groups': has(group.members) ? group.members.filter(m, "@odata.type" in m && m["@odata.type"] == "#microsoft.graph.user" && m.?displayName.orValue('') != '').map(m, {
              'external_user_id': m.id,
              'external_group_id': group.id
            }) : []
          }) + [{
            'config_type': 'Azure::Group',
            'id': '__external_users__',
            'name': '__external_users__',
            'external_users':
              dyn(config).map(g, has(g.members) ? g.members.filter(m,
                "@odata.type" in m &&
                m["@odata.type"] == "#microsoft.graph.user" &&
                m.?displayName.orValue('') != ''
              ) : []).flatten()
              .map(m, m.id).uniq()
              .map(uid,
                dyn(config).map(g, has(g.members) ? g.members.filter(m,
                  "@odata.type" in m &&
                  m["@odata.type"] == "#microsoft.graph.user" &&
                  m.id == uid
                ) : []).flatten()[0]
              )
              .map(m, {
                'name': m.?displayName.orValue(''),
                'account_id': m.id,
                'email': m.?mail.orValue(''),
                'user_type': 'human',
                'aliases': [m.id, m.?mail.orValue(''), m.?userPrincipalName.orValue(''), m.?onPremisesSamAccountName.orValue(''), m.?mailNickname.orValue(''), m.?employeeId.orValue('')].filter(a, a != '').uniq()
              })
          }]).toJSON()

Adjust the $filter on the groups URL, and the $select lists for the groups and for the expanded members, to match the groups and attributes you need.
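
The CEL transform above is dense, so here is a rough Python equivalent of what it computes, offered as a reading aid rather than as Mission Control code. It assumes the input is the list of group objects returned by Graph, omits the per-group config key for brevity, and uses made-up sample data. The function name transform is hypothetical.

```python
ALIAS_FIELDS = ["id", "mail", "userPrincipalName",
                "onPremisesSamAccountName", "mailNickname", "employeeId"]


def user_members(group):
    """Members typed as users that have a non-empty displayName."""
    return [m for m in group.get("members", [])
            if m.get("@odata.type") == "#microsoft.graph.user" and m.get("displayName")]


def transform(groups):
    """One item per group, plus one synthetic item holding the deduplicated users."""
    items, users_by_id = [], {}
    for g in groups:
        members = user_members(g)
        items.append({
            "config_type": "Azure::Group",
            "id": g["id"],
            "name": g["displayName"],
            "external_groups": [{"name": g["displayName"], "account_id": g["id"],
                                 "group_type": "security"}],
            "external_user_groups": [{"external_user_id": m["id"],
                                      "external_group_id": g["id"]} for m in members],
        })
        for m in members:
            users_by_id.setdefault(m["id"], m)  # keep the first occurrence of each user

    external_users = []
    for m in users_by_id.values():
        aliases = []
        for field in ALIAS_FIELDS:
            value = m.get(field) or ""
            if value and value not in aliases:  # drop empty and duplicate aliases
                aliases.append(value)
        external_users.append({"name": m["displayName"], "account_id": m["id"],
                               "email": m.get("mail") or "", "user_type": "human",
                               "aliases": aliases})
    items.append({"config_type": "Azure::Group", "id": "__external_users__",
                  "name": "__external_users__", "external_users": external_users})
    return items


# Made-up sample: one user in two groups, plus a nested group that is skipped.
jane = {"@odata.type": "#microsoft.graph.user", "id": "u1", "displayName": "Jane Doe",
        "mail": "jane@example.com", "userPrincipalName": "jane@example.com",
        "mailNickname": "jane"}
nested = {"@odata.type": "#microsoft.graph.group", "id": "g9", "displayName": "Nested"}
result = transform([
    {"id": "g1", "displayName": "Flanksource Admins", "members": [jane, nested]},
    {"id": "g2", "displayName": "Flanksource Devs", "members": [jane]},
])
```

The synthetic __external_users__ item exists so that a user who belongs to several groups is emitted once, with a single merged alias list.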

How Aliases Work

Every identity string added to the aliases array on a user is stored in the aliases column of the users table. When Mission Control processes an access record, it resolves the user named in that record by matching the identifier against these aliases.

For example, if a database access log references a user by onPremisesSamAccountName (e.g. jdoe) and the scraper stored that same string as an alias, Mission Control links the access record to the correct user with no manual mapping.

Store every known login identifier as an alias: id, mail, userPrincipalName, onPremisesSamAccountName, mailNickname, employeeId.
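
The matching idea can be illustrated with a small Python sketch. This is not Mission Control's implementation; the function names and sample user are hypothetical, and the sketch assumes exact string matching because the normalization rules are not described here.

```python
def build_alias_index(users):
    """Map every alias to the account_id of the user that owns it."""
    index = {}
    for user in users:
        for alias in user["aliases"]:
            # First writer wins if two users happen to share an alias.
            index.setdefault(alias, user["account_id"])
    return index


def resolve(index, identifier):
    """Return the owning account_id, or None when no alias matches."""
    return index.get(identifier)


# Hypothetical user as emitted under external_users.
users = [{
    "account_id": "3f2c-example-guid",
    "aliases": ["3f2c-example-guid", "jdoe@contoso.com", "jdoe", "E1234"],
}]
index = build_alias_index(users)

from_db_log = resolve(index, "jdoe")       # sAMAccountName seen in a database log
from_hr_feed = resolve(index, "E1234")     # employeeId seen in an HR export
unmatched = resolve(index, "someone-else")
```

An identifier that was never stored as an alias resolves to nothing, which is why the list of stored aliases should be as complete as possible.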

Common User Attributes

Attribute	Description	Useful As Alias?
`id`	Azure AD object ID (GUID)	Yes — stable, unique identifier
`mail`	Primary email address	Yes — commonly used in access logs
`userPrincipalName`	UPN (e.g. `user@contoso.com`)	Yes — default sign-in identifier
`onPremisesSamAccountName`	On-prem AD sAMAccountName (e.g. `jdoe`)	Yes — matches database and VPN logs
`employeeId`	HR system employee ID	Yes — matches HR and ITSM systems
`mailNickname`	Exchange alias (e.g. `jdoe`)	Yes — matches legacy mail systems
`department`	User's department	No — useful for filtering, not identification
`jobTitle`	User's job title	No — useful for reporting

Custom `$filter` Examples

The MS Graph API supports OData $filter queries to narrow results. Use these with the url parameter in the HTTP scraper.

Filter groups by display name prefix:

https://graph.microsoft.com/v1.0/groups?$filter=startsWith(displayName,'app-')&$select=id,displayName,description

Filter users by department:

https://graph.microsoft.com/v1.0/users?$filter=department eq 'Engineering'&$select=id,userPrincipalName,mail,department

List only enabled users:

https://graph.microsoft.com/v1.0/users?$filter=accountEnabled eq true&$select=id,userPrincipalName,mail,accountEnabled
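
If you generate these URLs programmatically, a small helper keeps the OData options readable while percent-encoding spaces. The sketch below is one way to do it with the Python standard library; graph_url is a hypothetical helper, not part of any SDK.

```python
from urllib.parse import quote

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def graph_url(resource: str, **options: str) -> str:
    """Build a Graph URL; each keyword becomes a $-prefixed OData option."""
    parts = []
    for name, value in options.items():
        # Leave OData punctuation readable, encode spaces and other characters.
        encoded = quote(value, safe="(),'")
        parts.append(f"${name}={encoded}")
    return f"{GRAPH_BASE}/{resource}?" + "&".join(parts)


url = graph_url(
    "users",
    filter="department eq 'Engineering'",
    select="id,userPrincipalName,mail,department",
)
```

The resulting string can be pasted into the url field of an HTTP scraper entry.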

info

Some advanced $filter queries (e.g. using endsWith, NOT, or property paths not indexed by default) require the ConsistencyLevel: eventual HTTP header and the $count=true query parameter. See the Microsoft Graph advanced query documentation for details.
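
As a minimal sketch of what an advanced query needs, the Python helper below appends $count=true to a URL and returns the ConsistencyLevel header alongside it. The helper name is hypothetical, the endsWith filter is only an example, and the Authorization header is omitted. How custom headers are declared on an HTTP scraper entry is not covered here.

```python
def advanced_query(url: str):
    """Return (url, headers) with the two additions advanced queries require."""
    separator = "&" if "?" in url else "?"
    headers = {"ConsistencyLevel": "eventual"}
    return f"{url}{separator}$count=true", headers


adv_url, adv_headers = advanced_query(
    "https://graph.microsoft.com/v1.0/users?$filter=endsWith(mail,'@contoso.com')"
)
```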

Pagination

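Without a pagination block, the HTTP scraper does not follow @odata.nextLink pagination tokens, so a single request returns at most the number of items specified by $top (maximum 999 for most MS Graph endpoints). The ScrapeConfig above opts in to pagination: nextPageExpr returns the next URL (or an empty string to stop), and maxPages caps the number of pages fetched.

For collections of up to 999 items, set $top=999 to retrieve everything in one request.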

For larger collections:

Partition via $filter: split requests by department, domain, or another attribute to keep each response under the limit
Use the built-in Entra ID scraper: it handles pagination automatically and is recommended for full-tenant scrapes
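
The nextPageExpr and maxPages settings in the ScrapeConfig above describe a follow-the-link loop with a page cap. A rough Python sketch of that loop follows; the fetch callable and the fake pages are stand-ins for real HTTP calls, and the exact way the scraper counts pages is an assumption.

```python
def iter_items(first_url, fetch, max_pages=5):
    """Yield items from each page, following @odata.nextLink up to max_pages."""
    url, pages = first_url, 0
    while url and pages < max_pages:
        body = fetch(url)                       # parsed JSON body of one response
        yield from body.get("value", [])
        url = body.get("@odata.nextLink", "")   # empty string ends the loop
        pages += 1


# Fake three-page collection keyed by URL (stand-in for Graph responses).
fake_pages = {
    "page1": {"value": ["g1", "g2"], "@odata.nextLink": "page2"},
    "page2": {"value": ["g3"], "@odata.nextLink": "page3"},
    "page3": {"value": ["g4"]},
}

all_items = list(iter_items("page1", fake_pages.__getitem__))
capped_items = list(iter_items("page1", fake_pages.__getitem__, max_pages=2))
```

With the cap in place, a collection larger than $top multiplied by the page limit is silently truncated, which is the case the options above are meant to address.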

tip

If you need all users or groups regardless of count, use the built-in Entra ID scraper instead of the HTTP scraper. It handles pagination, rate limiting, and incremental updates automatically.

Next Steps

Config Access Reference
