Custom Scraper

When you enable full: true, custom scrapers can return complex objects containing config data, changes, access logs, users, groups, and roles. This enables IAM scraping and compliance auditing.

For transforming scraped configs before they are saved (field exclusions, masking, relationships, etc.), see the Transformation Reference.

Top-Level Fields

With full: true, the scraper expects each item to have these top-level fields.

| Field | Schema | Description |
| --- | --- | --- |
| config | object | The actual configuration data to store |
| changes | []Change | Change events |
| logs | []AccessLog | Access log entries |
| access | []ConfigAccess | Access permissions linking users/groups/roles to configs |
| users | []User | User definitions from identity systems |
| groups | []Group | Group definitions from identity systems |
| roles | []Role | Role definitions from identity systems |
| user_groups | []UserGroup | User-to-group membership mappings |

Additional top-level fields for config identification:

| Field | Description |
| --- | --- |
| id | Unique identifier for the config item |
| external_id | External identifier for the config (used by access_logs and config_access to resolve config references) |
| config_type or type | Config type (used alongside external_id for config reference resolution) |
| uuid or config_id | UUID of the config item (used for direct config reference) |
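
The field shapes above can be checked before an item is handed to the scraper. The following is a minimal sketch in Python, assuming each item is a plain dictionary; `check_item_shape` and `LIST_FIELDS` are hypothetical names for illustration and are not part of the scraper.

```python
from typing import Any

# Top-level fields that hold lists of entries, per the table above.
LIST_FIELDS = ("changes", "logs", "access", "users", "groups", "roles", "user_groups")


def check_item_shape(item: dict[str, Any]) -> list[str]:
    """Return a list of shape problems for one scraped item (empty if none)."""
    problems = []
    # `config` is documented as an object.
    if "config" in item and not isinstance(item["config"], dict):
        problems.append("config must be an object")
    # Every other documented top-level collection is an array.
    for field in LIST_FIELDS:
        if field in item and not isinstance(item[field], list):
            problems.append(f"{field} must be a list")
    return problems
```

The check only covers container types; it does not validate the entries inside each list.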

User

| Field | Description | Schema |
| --- | --- | --- |
| account_id* | Account identifier from the source system | string |
| name* | Display name of the user | string |
| aliases | Alternative identifiers used for alias resolution in config_access and access_logs (e.g. userPrincipalName, mail, onPremisesSamAccountName) | []string |
| email | User's email address | string |
| user_type | Type of user: human, service, or system | string |

Group

| Field | Description | Schema |
| --- | --- | --- |
| account_id* | Group identifier from the source system | string |
| name* | Group name | string |
| aliases | Alternative identifiers used for alias resolution in config_access | []string |
| group_type | Type of group: team, role, department, or security | string |

Role

| Field | Description | Schema |
| --- | --- | --- |
| account_id* | Role identifier from the source system | string |
| name* | Role name | string |
| aliases | Alternative identifiers used for alias resolution in config_access | []string |
| description | Human-readable description of the role's purpose | string |
| role_type | Type of role: builtin, custom, database, or application | string |

User Group

| Field | Description | Schema |
| --- | --- | --- |
| group* | Group name or alias to resolve against groups[].aliases | string or []string |
| user* | User name or alias to resolve against users[].aliases | string or []string |

Config Access

The access array links users, groups, and roles to specific config items.

| Field | Description | Schema |
| --- | --- | --- |
| id* | Unique identifier for this access record | string |
| config_id | UUID or external ID of the config item | string |
| group | Group name or alias to resolve against groups[].aliases | string or []string |
| role | Role name or alias to resolve against roles[].aliases | string or []string |
| source | Source identifier for this access record | string |
| user | User name or alias to resolve against users[].aliases | string or []string |

At least one of user, group, or role must be set.
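
This rule can be enforced before an item is emitted. The following is a minimal sketch in Python, assuming access records are plain dictionaries and that the asterisk in the table marks a required field; `validate_access` is a hypothetical helper, not part of the scraper.

```python
from typing import Any


def validate_access(record: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one access record (empty if valid)."""
    problems = []
    # Assumption: `id*` in the table means the field is required.
    if not record.get("id"):
        problems.append("id is required")
    # At least one principal (user, group, or role) must be set.
    if not any(record.get(key) for key in ("user", "group", "role")):
        problems.append("at least one of user, group, or role must be set")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report every invalid record in a single pass.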

Access Log

The logs array records individual access events.

| Field | Description | Schema |
| --- | --- | --- |
| config_id | UUID or external ID of the config item | string |
| count | Number of aggregated access events (default: 1) | integer |
| created_at | Timestamp when the access occurred | timestamp |
| mfa | Whether multi-factor authentication was used | boolean |
| properties | Additional access metadata (IP address, session info, client, etc.) | [map[string]string] |
| user | User name or alias to resolve against users[].aliases. If no matching user exists, one is auto-created. | string or []string |

user must be set.
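
As an illustration of the fields above, the following Python sketch builds a single log entry. It assumes entries are plain dictionaries and that created_at is accepted in the same UTC format as the change timestamp in the example on this page; `make_access_log` is a hypothetical helper, not part of the scraper.

```python
from datetime import datetime, timezone
from typing import Any


def make_access_log(user: str, created_at: datetime, **extra: Any) -> dict[str, Any]:
    """Build one access log entry, rejecting entries without a user."""
    if not user:
        raise ValueError("user must be set")
    entry: dict[str, Any] = {
        "user": user,
        # Assumption: timestamps are serialized as UTC with a trailing "Z".
        "created_at": created_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        # The documented default, written out explicitly.
        "count": 1,
    }
    # Optional fields such as config_id, mfa, or properties.
    entry.update(extra)
    return entry
```

Pass a timezone-aware datetime; a naive one would be interpreted in the local timezone by `astimezone`.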

Alias Resolution

When processing access and logs entries, the scraper resolves aliases to actual entity IDs:

  1. user is matched against the aliases field of each users entry
  2. group is matched against the aliases field of each groups entry
  3. role is matched against the aliases field of each roles entry

If no matching entity is found, one is auto-created with the alias as its name.
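
The matching and auto-creation behaviour described above can be sketched in Python as follows. This is an illustration only, not the scraper's implementation: `resolve_alias` is a hypothetical function, entities are assumed to be plain dictionaries, and storing the alias in the new entity's aliases list is an assumption.

```python
from typing import Any


def resolve_alias(alias: str, entities: list[dict[str, Any]]) -> dict[str, Any]:
    """Return the entity whose aliases contain `alias`, creating one if none match."""
    for entity in entities:
        if alias in entity.get("aliases", []):
            return entity
    # No match: mimic the documented auto-creation, using the alias as the name.
    # Assumption: the alias is also recorded so later lookups find this entity.
    created = {"name": alias, "aliases": [alias]}
    entities.append(created)
    return created
```

The same function would apply to users, groups, and roles, since all three are matched through their aliases field.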

Config Reference Resolution

Entries in access and logs reference a config item through config_id, which is either a UUID pointing directly to a config item or a non-UUID string treated as an external ID.

If config_id is not provided, the top-level id/external_id/config_type fields are used as defaults.
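
One way to picture this resolution is the Python sketch below. It is not the scraper's code: `resolve_config_ref` and `is_uuid` are hypothetical helpers, and both the preference for external_id over id in the fallback and the pairing of a non-UUID config_id with the top-level config type are assumptions.

```python
import uuid
from typing import Any


def is_uuid(value: str) -> bool:
    """Return True if `value` parses as a UUID string."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def resolve_config_ref(entry: dict[str, Any], item: dict[str, Any]) -> dict[str, Any]:
    """Describe which config an access or log entry points at."""
    # The top-level item may carry the type as `config_type` or `type`.
    config_type = item.get("config_type") or item.get("type")
    config_id = entry.get("config_id")
    if config_id:
        if is_uuid(config_id):
            # A UUID points directly at a config item.
            return {"uuid": config_id}
        # Any other string is treated as an external ID.
        return {"external_id": config_id, "config_type": config_type}
    # No config_id: fall back to the top-level identification fields.
    # Assumption: external_id takes precedence over id.
    return {"external_id": item.get("external_id") or item.get("id"), "config_type": config_type}
```

Note that `uuid.UUID` also accepts a few alternative spellings, such as 32 hex digits without hyphens, so this check may be more lenient than the scraper's own.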

Example

config-with-access.json
{
  "id": "test-org-role-access",
  "config": {
    "name": "Test Organization",
    "type": "organization"
  },
  "users": [
    {
      "name": "Charlie Brown",
      "account_id": "org-789",
      "user_type": "human",
      "email": "charlie@example.com",
      "aliases": ["charlie-brown", "charlie@example.com"]
    }
  ],
  "roles": [
    {
      "name": "Editor",
      "account_id": "org-789",
      "role_type": "custom",
      "description": "Edit access",
      "aliases": ["editor-role", "edit-access"]
    }
  ],
  "groups": [
    {
      "name": "Editors Group",
      "account_id": "org-789",
      "group_type": "team",
      "aliases": ["editors-group", "edit-team"]
    }
  ],
  "access": [
    {
      "id": "role-access-001",
      "external_config_id": {
        "config_type": "Organization",
        "external_id": "test-org-role-access"
      },
      "user": "charlie-brown",
      "role": "editor-role",
      "group": "editors-group"
    }
  ],
  "changes": [
    {
      "change_type": "permission_grant",
      "summary": "Charlie granted editor access",
      "created_at": "2025-01-08T10:00:00Z"
    }
  ]
}

Extracting Changes & Access Logs

When you enable full: true, custom scrapers can ingest changes and access logs from external systems by separating the config data from change events in your source.