Use Case Demo: Fields Quality (CIM and non-CIM)
This white paper describes a new concept for performing continuous fields quality assessment on Splunk data.
This leverages different scalable Splunk techniques and TrackMe components to perform this task.
Fields quality assessment is a crucial aspect of Splunk monitoring, as it helps ensure that the data is accurate and consistent, ready to serve use cases.
By implementing this, you will be able to build a solid, robust, and scalable solution to monitor the quality of fields parsing for your various sourcetypes in Splunk, as well as receive automated alerts when fields quality issues are detected.
These concepts are applicable to both Splunk Common Information Model (CIM) and non-CIM data.
Finally, we leverage a TrackMe restricted component called Flex Objects (splk-flx) to perform the continuous fields quality assessment, although parts of this logic are available to TrackMe Community Edition users.
TrackMe version 2.1.18 and later is required to leverage this feature.
This work was made possible thanks to the support and deep collaboration of a major TrackMe customer, thank you!
High level workflow and diagram
The following diagram shows the high level workflow for fields quality assessment:

From a high level perspective, the workflow is as follows:
Step 1: Collect
The user defines a set of Splunk scheduled searches that leverage Splunk Sampling to sample data from the CIM data models or events of their choice.
These searches call the streaming TrackMe backend trackmefieldsquality, which performs the assessment of the fields quality:
- Using one of the different methods supported by the command, define the fields of interest to monitor.
- The command verifies common issues: missing, empty or null, equal to "unknown".
- The command can also check the content of the fields using a regular expression submitted as part of a model provided in input.
- The command generates a JSON object with the results of the assessment, as well as a global summary of the assessment for the sampled event.
- Metadata are stored in the JSON object (index, sourcetype, etc.) and can be extended as per the user's needs.
The search finally calls the Splunk collect command to index the JSON results using the TrackMe sourcetype trackme:fields_quality.
Step 2: Monitor & Alert
A TrackMe Virtual Tenant is created, with the TrackMe Flex Objects (splk-flx) component enabled.
A Flex Object tracker is created which consumes the resulting JSON events, defines the associated entities and tracks the quality of the sampling over time.
Thresholds can be defined for each entity using TrackMe capabilities, to finally generate automated alerts when the percentages of compliance breach the defined thresholds.
Phase 1: Collect
The primary step is to define what needs to be monitored depending on your needs and objectives.
The collect phase is highly flexible and scalable; in short, the concept is the following:
Create a set of scheduled searches that you will execute on a regular basis, for instance once per day during off-peak night hours.
Each search will use the Splunk sampling feature, which randomly selects a subset of events from the scope of the search.
It then calls the trackmefieldsquality command with different parameters, which performs the assessment of the fields quality and generates a JSON object per event.
Finally, the search calls the collect command to index the JSON results using the TrackMe sourcetype trackme:fields_quality.
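The same pattern applies to non-CIM data by sampling raw events directly; a minimal sketch, assuming an illustrative index, sourcetype and list of fields:
index=webserver sourcetype="nginx:plus:kv"
``` custom metadata to identify the scope ```
| eval datamodel="none", nodename="nginx"
``` call the backend ```
| trackmefieldsquality fields_to_check_list="action,status,src,url" pretty_print_json=False output_mode=json metadata_fields="datamodel,nodename" include_field_values=True
``` call collect ```
| collect index=summary sourcetype=trackme:fields_quality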
About the Common Information Model (CIM)
If your objective is to monitor the quality of your CIM parsing, from the lens of the CIM data models, you will likely want to have at least one search per CIM data model and node.
Example:
- 1 search for the Web datamodel
- 1 search for the Network_Traffic datamodel
- 1 search for the Malware datamodel
- 1 search for the Endpoint.Processes datamodel
- etc.
Hint
About TrackMe system level sharing
By default, TrackMe shares its content, including the trackmefieldsquality command, at the application level only.
This means that you cannot execute this command outside of the TrackMe application unless you share TrackMe at the system level.
You can update your system configuration to change this behavior: in Splunk Web, navigate to Manage Apps, and update the permissions on TrackMe so that "Apply selected role permissions to" is set to "All apps (system)".
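Alternatively, the same result can be achieved in configuration; a minimal sketch, assuming file system access to the TrackMe application directory:
# $SPLUNK_HOME/etc/apps/trackme/metadata/local.meta
# export all TrackMe knowledge objects at the system level
# (equivalent to the Splunk Web permissions change described above)
[]
export = system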
Example: Web CIM data model
In this example, we will define the scheduled search to monitor the Web CIM data model.
The code of the scheduled search is the following:
| datamodel Web Web flat strict_fields=false summariesonly=t
``` according to your needs, add filters to the search such as indexes filters, etc```
| search index=webserver
``` custom metadata to identify the datamodel ```
| eval datamodel="Web", nodename="Web"
``` call the backend ```
| trackmefieldsquality fields_to_check_list="action,app,bytes,bytes_in,bytes_out,category,dest,src,http_method,http_referrer,http_user_agent,src,url" pretty_print_json=False output_mode=json metadata_fields="datamodel" include_field_values=True
``` call collect ```
| collect index=summary sourcetype=trackme:fields_quality
When creating the scheduled search, perform some scaling validation and define the sampling rate to use:
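The sampling ratio can also be set directly on the scheduled search definition; a minimal sketch in savedsearches.conf, where the stanza name, schedule and ratio are illustrative:
# savedsearches.conf (stanza name, schedule and ratio are illustrative)
[fields_quality_web_datamodel]
enableSched = 1
cron_schedule = 0 3 * * *
# Splunk event sampling: process 1 event out of every 100
dispatch.sample_ratio = 100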

Explanation: datamodel command
Consult the Splunk documentation for the datamodel command for more information about its usage.
In short:
- Define the data model and the node name; here we call the Web data model and the Web node.
- The flat option returns the fields without the data model node prefix, for instance Web.action becomes action.
- The option strict_fields=false allows returning all fields.
- The option summariesonly=t restricts the search to accelerated data only, for scaling purposes.
In our example, we therefore define the call to the datamodel and the scope of the search:
| datamodel Web Web flat strict_fields=false summariesonly=t
``` according to your needs, add filters to the search such as indexes filters, etc```
| search index=webserver
Depending on your use cases and preferences, one search can suffice to address all your needs, or you may prefer to implement several searches specialized per scope, for instance one search per sourcetype, vendor, etc., as shown in the sketch below.
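A minimal sketch of a search specialised for a single sourcetype (the sourcetype value is illustrative):
| datamodel Web Web flat strict_fields=false summariesonly=t
``` restrict the scope to a given sourcetype ```
| search index=webserver sourcetype="nginx:plus:kv"
| eval datamodel="Web", nodename="Web"
| trackmefieldsquality fields_to_check_list="action,app,bytes,url" pretty_print_json=False output_mode=json metadata_fields="datamodel" include_field_values=True
| collect index=summary sourcetype=trackme:fields_quality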
Explanation: define the metadata fields
In our search, we define the following:
| eval datamodel="Web", nodename="Web"
This is nothing more than a way for us to define the name of the data model we are monitoring; this information is then leveraged when calling the trackmefieldsquality command.
Explanation: trackmefieldsquality command
This is where the real and important part of the work is done.
A resulting JSON example is the following (we will explain this in the next sections!):
{
"time": 1747256773,
"action": {
"status": "success",
"description": "Field exists and is valid.",
"is_missing": false,
"is_empty": false,
"is_unknown": false,
"regex_failure": false
},
"app": {
"status": "success",
"description": "Field exists and is valid.",
"is_missing": false,
"is_empty": false,
"is_unknown": false,
"regex_failure": false
},
"bytes": {
"status": "success",
"description": "Field exists and is valid.",
"is_missing": false,
"is_empty": false,
"is_unknown": false,
"regex_failure": false
},
"bytes_in": {
"status": "success",
"description": "Field exists and is valid.",
"is_missing": false,
"is_empty": false,
"is_unknown": false,
"regex_failure": false
},
"dest": {
"status": "success",
"description": "Field exists and is valid.",
"is_missing": false,
"is_empty": false,
"is_unknown": false,
"regex_failure": false
},
"src": {
"status": "success",
"description": "Field exists and is valid.",
"is_missing": false,
"is_empty": false,
"is_unknown": false,
"regex_failure": false
},
"http_method": {
"status": "success",
"description": "Field exists and is valid.",
"is_missing": false,
"is_empty": false,
"is_unknown": false,
"regex_failure": false
},
"http_referrer": {
"status": "failure",
"description": "Field is 'unknown'.",
"is_missing": false,
"is_empty": false,
"is_unknown": true,
"regex_failure": false
},
"http_user_agent": {
"status": "success",
"description": "Field exists and is valid.",
"is_missing": false,
"is_empty": false,
"is_unknown": false,
"regex_failure": false
},
"url": {
"status": "success",
"description": "Field exists and is valid.",
"is_missing": false,
"is_empty": false,
"is_unknown": false,
"regex_failure": false
},
"summary": {
"overall_status": "failure",
"total_fields_checked": 12,
"total_fields_failed": 1,
"total_fields_passed": 11,
"percentage_failed": 8.33,
"percentage_passed": 91.67
},
"metadata": {
"time_epoch": 1747256773,
"time_human": "Wed May 14 21:06:13 2025 UTC",
"index": "webserver",
"sourcetype": "nginx:plus:kv",
"host": "uk1.trackme",
"source": "/var/log/nginx/access.log",
"datamodel": "Web"
},
"event_id": "f0d2a9ba805f92a6c045a7085441386170de2b0d1807968837e926d3b39faceb"
}
The command accepts the following parameters:
argument | description | default | example or valid values
---|---|---|---
fields_to_check_list | The list of fields to verify, provided as a comma-separated list | None | "action,app,bytes,url"
fields_to_check_fieldname | Name of the field containing the list of fields to check (comma-separated) | None | "fields_list"
fields_to_check_dict | JSON string containing a dictionary of fields to check with optional regex patterns | None | '{"field1": {"name": "field1", "regex": "^[A-Z]+$"}, "field2": {"name": "field2"}}'
fields_to_check_dict_path | Path to a JSON file containing a dictionary of fields to check with optional regex patterns | None | "/opt/splunk/etc/apps/myapp/mydir/web_datamodel.json"
fields_to_check_dict_fieldname | Name of the field containing a JSON string with a dictionary of fields to check | None | fields_dict
include_field_values | Boolean option to include field values in the JSON summary | False | True/False
pretty_print_json | Boolean option to pretty print the JSON summary | True | True/False
output_mode | The mode to output the results (json or raw) | json | json
metadata_fields | CSV list of metadata fields to include in the metadata section of the JSON | index,sourcetype,host,source | "datamodel"
summary_fieldname | Defines the name of the summary field | summary | "summary"
metadata_fieldname | Defines the name of the metadata field added to the summary JSON | metadata | "metadata"
The first 5 options are mutually exclusive; only one of them can be used at a time:
fields_to_check_list
fields_to_check_fieldname
fields_to_check_dict
fields_to_check_dict_path
fields_to_check_dict_fieldname
These options exist so that we can cover all use cases: they provide full flexibility to use different Splunk techniques, such as subsearches, storing the list of fields or models in lookups, dynamic generation in SPL, etc.
Let's walk through an example for each of these options:
Argument: fields_to_check_list
This is the simplest use case: provide the list of fields to be checked as a comma-separated list.
For each field, we will check the following:
- Missing
- Empty
- Null
- Equal to unknown
If the field passes all these checks, it is declared as valid with a status of success, and failure otherwise.
The command will store in the JSON object a section for each field, which includes flags for each check, with a boolean value True or False:
- is_missing
- is_empty
- is_unknown
- regex_failure
In addition, the command will account for the field and its status in the summary section of the JSON object:
- overall_status: the overall status of the field checks, either success or failure
- total_fields_checked: the total number of fields checked
- total_fields_failed: the total number of fields that failed
- total_fields_passed: the total number of fields that passed
- percentage_failed: the percentage of fields that failed
- percentage_passed: the percentage of fields that passed
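For instance, in the JSON example above, 12 fields were checked and 1 failed: percentage_failed = 1/12 * 100 = 8.33 and percentage_passed = 11/12 * 100 = 91.67.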
Note: this argument does NOT process a regular expression to check the content, therefore the regex_failure flag will always be False (see the next options for this).
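A minimal call sketch for this argument, reusing the fields from our Web example:
| trackmefieldsquality fields_to_check_list="action,app,bytes,url"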
Argument: fields_to_check_fieldname
This does exactly the same as fields_to_check_list, but instead of providing the list of fields to check as a comma-separated list, we provide the name of a field that contains the list of fields to check.
You would therefore call the command as follows:
Example:
| eval fields_list="action,app,bytes,url"
| trackmefieldsquality fields_to_check_fieldname="fields_list"
The point of having this option is that you could, for instance, use a Splunk subsearch to generate the list of fields dynamically, by accessing a lookup table where you store the fields depending on your criteria, or any other solution of your choice.
Hint
Providing a JSON dictionary model
The next 3 options allow you to provide a JSON dictionary model that describes the fields to check, as well as optional parameters for each field.
In particular, you can define a regular expression with the regex key to be applied against the value, allowing you to validate the content of the field according to any needs.
You can also set the allow_unknown key to True or False, which can be used to disable the is_unknown check for the field.
Argument: fields_to_check_dict
This option is more sophisticated and allows you to define a dictionary that models the fields to check, as well as an optional regular expression to be applied against the value.
For instance, the following dictionary would verify the field bytes, including the fact that it should be a numerical value using a regular expression, the field action, which for instance would accept only success or failure, and finally the field http_referrer, where we would only perform the basic checks without verifying that its value matches certain criteria.
| trackmefieldsquality fields_to_check_dict="{\"bytes\": {\"name\": \"bytes\", \"regex\": \"^\\\d*\"}, \"action\": {\"name\": \"action\", \"regex\": \"^(success|failure)$\"}, \"http_referrer\": {\"name\": \"http_referrer\"}}" pretty_print_json=False output_mode=json metadata_fields="datamodel" include_field_values=True
In this case, if a regex is provided, the regex_failure flag will be set to True if the value does not match the regular expression, and False otherwise, which accounts for the status of the field in addition to the other checks.
Example of JSON output:
{
"time": 1747261746,
"bytes": {
"status": "success",
"description": "Field exists and is valid.",
"is_missing": false,
"is_empty": false,
"is_unknown": false,
"regex_failure": false,
"value": "309"
},
"action": {
"status": "failure",
"description": "Field exists but value does not match the required pattern.",
"is_missing": false,
"is_empty": false,
"is_unknown": false,
"regex_failure": true,
"value": "Bad Request"
},
"http_referrer": {
"status": "failure",
"description": "Field is 'unknown'.",
"is_missing": false,
"is_empty": false,
"is_unknown": true,
"regex_failure": false,
"value": "unknown"
},
"summary": {
"overall_status": "failure",
"total_fields_checked": 3,
"total_fields_failed": 2,
"total_fields_passed": 1,
"percentage_failed": 66.67,
"percentage_passed": 33.33
},
"metadata": {
"time_epoch": 1747261746,
"time_human": "Wed May 14 22:29:06 2025 UTC",
"index": "webserver",
"sourcetype": "nginx:plus:kv",
"host": "trackme-solutions.com",
"source": "/var/log/nginx/access.log",
"datamodel": "Web"
},
"event_id": "f5ed1437ee8486a3782ebaea846dad37c52d47b825d1913c1ae7d085ba01f943"
}
Hint
Escaping backslashes and special characters
The tricky part is that you need to pay attention to the JSON provided as input to the command.
In particular, double quotes within the JSON string need to be escaped.
The regular expression also needs to be escaped, for instance ^\\\d* which would otherwise be ^\d* in normal circumstances.
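To avoid manual escaping altogether, a possible alternative sketch is to build the dictionary with the eval json_object function (available in recent Splunk versions) and pass it via fields_to_check_dict_fieldname; the regex values here are illustrative:
``` build the dictionary with json_object to avoid manual JSON escaping ```
| eval fields_dict=json_object("bytes", json_object("name", "bytes", "regex", "^[0-9]*"), "action", json_object("name", "action", "regex", "^(success|failure)$"), "http_referrer", json_object("name", "http_referrer"))
| trackmefieldsquality fields_to_check_dict_fieldname="fields_dict" pretty_print_json=False output_mode=json metadata_fields="datamodel" include_field_values=True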
Argument: fields_to_check_dict_fieldname
Similarly to fields_to_check_fieldname, this option allows you to provide the name of a field that contains the dictionary of fields to check, as in the previous example.
This allows you to use a Splunk subsearch or lookup to generate the dictionary dynamically, for instance by accessing a lookup table where you store the fields depending on your criteria, or any other solution of your choice.
Example:
| eval fields_dict="{\"bytes\": {\"name\": \"bytes\", \"regex\": \"^\\\d*\"}, \"action\": {\"name\": \"action\", \"regex\": \"^(success|failure)$\"}, \"http_referrer\": {\"name\": \"http_referrer\"}}"
| trackmefieldsquality fields_to_check_dict_fieldname="fields_dict" pretty_print_json=False output_mode=json metadata_fields="datamodel" include_field_values=True
Argument: fields_to_check_dict_path
This option is similar to fields_to_check_dict, but instead of providing the dictionary as a JSON string, we provide the path to a JSON file that contains the dictionary.
The file must exist on the file system of the Splunk instance, and the path must be provided as a string.
Our JSON file would look like this:
{
"action": {
"name": "action",
"regex": "^(success|failure)$"
},
"bytes": {
"name": "bytes",
"regex": "^\\d*"
},
"http_referrer": {
"name": "http_referrer"
}
}
Example:
| trackmefieldsquality fields_to_check_dict_path="/opt/splunk/etc/apps/myapp/mydir/web_datamodel.json" pretty_print_json=False output_mode=json metadata_fields="datamodel" include_field_values=True
Argument: include_field_values
This option allows you to include the field values in the JSON output, which can be useful for analytics and reporting purposes.

Argument: pretty_print_json
This option allows you to pretty print the JSON output, which can be useful for debugging purposes.


Argument: output_mode
This option allows you to specify the output mode, which can be json or raw.
We recommend using the json mode for most use cases; in particular, this document focuses on the JSON output, which is leveraged and indexed using the TrackMe sourcetype trackme:fields_quality.
output_mode=json
In JSON mode, the command verifies the fields and generates a JSON object per event:



output_mode=raw
In raw mode, the command generates the events as they are, and adds a field called json_summary which contains our JSON object:


Argument: metadata_fields
This option allows you to specify the metadata fields to include in the JSON output, as a comma-separated list of fields.
The metadata fields always include the following:
- index: the index of the event
- sourcetype: the sourcetype of the event
- host: the host of the event
- source: the source of the event
By defining the metadata_fields parameter, you can add additional fields to the JSON output, for instance the datamodel field, which in our implementation is used to identify the data model of the event.
In our example, we define a field called datamodel using an eval, which will be added to the JSON output:
| eval datamodel="Web"
| trackmefieldsquality fields_to_check_list="action,app,bytes,url" pretty_print_json=False output_mode=json metadata_fields="datamodel" include_field_values=True

Argument: summary_fieldname
This option allows you to specify the name of the summary field in the JSON output; it defaults to summary but can be customised if needed, for instance if there is a conflict with a field from the data model or events.
In this example, instead of summary, we use quality_summary:
| trackmefieldsquality fields_to_check_list="action,app,bytes,url" pretty_print_json=False output_mode=json metadata_fields="datamodel" include_field_values=True summary_fieldname="quality_summary"

Argument: metadata_fieldname
This option allows you to specify the name of the metadata field in the JSON output; it defaults to metadata but can be customised if needed, for instance if there is a conflict with a field from the data model or events.
In this example, instead of metadata, we use quality_metadata:
| trackmefieldsquality fields_to_check_list="action,app,bytes,url" pretty_print_json=False output_mode=json metadata_fields="datamodel" include_field_values=True metadata_fieldname="quality_metadata"

Centrally managing fields quality in a lookup table: simple example defining the list of fields to check
In this example, we will define a lookup table that contains the list of fields to check for each data model and node name; you can of course adapt this example and go beyond it, such as specialising the fields to check per sourcetype, etc.
Our lookup table has 3 fields:
- datamodel: the data model of the event
- nodename: the node name of the event
- fields_list: the list of fields to check for the event, separated by commas
Our lookup table looks like this:
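For illustration, the CSV content could look like the following sketch (the field lists are illustrative):
datamodel,nodename,fields_list
Web,Web,"action,app,bytes,dest,src,http_method,http_referrer,http_user_agent,url"
Authentication,Authentication,"action,app,dest,src,user"
Network_Traffic,All_Traffic,"action,bytes,dest,dest_port,src,src_port"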

When calling the command trackmefieldsquality, we will use the lookup table to define the list of fields to check for each event:
``` call the backend ```
| lookup datamodels_fieldsquality_simple.csv datamodel, nodename OUTPUT fields_list as quality_fields_list
| trackmefieldsquality fields_to_check_fieldname="quality_fields_list" pretty_print_json=False output_mode=json metadata_fields="datamodel,nodename" include_field_values=True
Centrally managing fields quality in a lookup table: example with a JSON dictionary model
In this example, we will define a lookup table that contains the JSON dictionary model for each data model and node name; you can of course adapt this example and go beyond it, such as specialising the fields to check per sourcetype, etc.
Our lookup table has 3 fields:
- datamodel: the data model of the event
- nodename: the node name of the event
- json_dict: the JSON dictionary model for the fields to check for the event
The following JSON example shows our Web data model:
{
"action": {
"name": "action",
"allow_unknown": false
},
"bytes": {
"name": "bytes",
"regex": "\\d+",
"allow_unknown": false
},
"dest": {
"name": "dest",
"allow_unknown": false
},
"site": {
"name": "site",
"regex": ".*",
"min_len": 1,
"allow_unknown": false,
"type": "str"
},
"src": {
"name": "src",
"regex": ".*",
"allow_unknown": false
},
"status": {
"name": "status",
"regex": "\\d+",
"allow_unknown": false
},
"url": {
"name": "url",
"regex": ".*",
"allow_unknown": false,
"type": "str"
},
"url_length": {
"name": "url_length",
"regex": "\\d+",
"min_len": 1,
"allow_unknown": false
}
}
Our lookup table looks like this:
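For illustration, the CSV content could look like the following sketch, with the JSON model stored in the json_dict column (shortened here; note the doubled quotes required by the CSV format):
datamodel,nodename,json_dict
Web,Web,"{""action"": {""name"": ""action"", ""allow_unknown"": false}, ""bytes"": {""name"": ""bytes"", ""regex"": ""\\d+"", ""allow_unknown"": false}}"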

When calling the command trackmefieldsquality, we can use:
``` call the backend ```
| lookup datamodels_fieldsquality_dict.csv datamodel, nodename OUTPUT json_dict as quality_json_dict
| trackmefieldsquality fields_to_check_dict_fieldname="quality_json_dict" pretty_print_json=False output_mode=json metadata_fields="datamodel,nodename" include_field_values=True
Final collect example
In this example, we monitor the fields quality of the following data models:
Authentication
Web
Network_Traffic
The following searches use the simple lookup table to define the list of fields to check for each data model:
Authentication:
| datamodel Authentication Authentication flat strict_fields=false summariesonly=t
``` custom metadata to identify the datamodel ```
| eval datamodel="Authentication", nodename="Authentication"
``` call the backend ```
| lookup datamodels_fieldsquality_simple.csv datamodel, nodename OUTPUT fields_list as quality_fields_list
| trackmefieldsquality fields_to_check_fieldname="quality_fields_list" pretty_print_json=False output_mode=json metadata_fields="datamodel,nodename" include_field_values=True
``` call collect ```
| collect index=summary sourcetype=trackme:fields_quality
Network_Traffic:
| datamodel Network_Traffic All_Traffic flat strict_fields=false summariesonly=t
``` custom metadata to identify the datamodel ```
| eval datamodel="Network_Traffic", nodename="All_Traffic"
``` call the backend ```
| lookup datamodels_fieldsquality_simple.csv datamodel, nodename OUTPUT fields_list as quality_fields_list
| trackmefieldsquality fields_to_check_fieldname="quality_fields_list" pretty_print_json=False output_mode=json metadata_fields="datamodel,nodename" include_field_values=True
``` call collect ```
| collect index=summary sourcetype=trackme:fields_quality
Web:
| datamodel Web Web flat strict_fields=false summariesonly=t
| search index=webserver
``` custom metadata to identify the datamodel ```
| eval datamodel="Web", nodename="Web"
``` call the backend ```
| lookup datamodels_fieldsquality_simple.csv datamodel, nodename OUTPUT fields_list as quality_fields_list
| trackmefieldsquality fields_to_check_fieldname="quality_fields_list" pretty_print_json=False output_mode=json metadata_fields="datamodel,nodename" include_field_values=True
``` call collect ```
| collect index=summary sourcetype=trackme:fields_quality
This looks like the following:

Phase 2: Monitor & Alert
This is the simplest part of the work!
In short, we will:
Create a new TrackMe Virtual Tenant dedicated to the purposes of monitoring the fields quality.
The Virtual Tenant enables the TrackMe Flex Objects (splk-flx) component.
We will create a Flex Object tracker using the out-of-the-box use case called splk_splunk_fields_quality.
The Flex Object tracker breaks down our metadata convention, automatically identifying the entities and tracking the quality of the sampling over time.
Creating the Virtual Tenant
We create a new Virtual Tenant called fields-quality:



Creating the Flex Object tracker
Once we have a Virtual Tenant, we can create a Flex Object tracker using the out-of-the-box use case called splk_splunk_fields_quality:

We create a new Flex Object tracker using the out-of-the-box use case called splk_splunk_fields_quality:


After a first execution, entities are created and ordered by Data Model:

Key Performance Indicators start to be collected:


The default behavior lets you enable and define a threshold according to your needs:

This would turn the entity red if the threshold is exceeded:

We now have a complete and flexible solution to monitor the quality of the fields of your data models over time!
We also have access to sampled events so that we can easily build additional dashboards for investigation and analytic purposes.
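For instance, a simple trend search over the indexed summaries can serve as a starting point for such dashboards:
index=summary sourcetype=trackme:fields_quality
| timechart avg(summary.percentage_passed) as avg_pct_passed by metadata.datamodel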
Annex: Flex Object tracker source code
The following code is the source code of the Flex Object tracker SPL logic:
index=summary sourcetype=trackme:fields_quality
``` rename multivalue fields for ease of use later on ```
| rename "summary.list_fields_passed{}" as summary.list_fields_passed, "summary.list_fields_failed{}" as summary.list_fields_failed
``` stats ```
| stats
avg(summary.percentage_passed) as summary.percentage_passed,
avg(summary.percentage_failed) as summary.percentage_failed,
values(summary.list_fields_passed) as summary.list_fields_passed,
values(summary.list_fields_failed) as summary.list_fields_failed by metadata.datamodel, metadata.nodename, metadata.sourcetype
| rename summary.* as "*", "metadata.*" as "*"
| foreach percentage_* [ eval <<FIELD>> = round('<<FIELD>>', 2) ]
``` build the list of fields that passed and failed ```
| eval all_fields = mvappend(list_fields_failed, list_fields_passed)
| eval all_fields = mvdedup(all_fields)
| eval final_state = mvmap(
all_fields,
if(
mvfind(list_fields_failed, all_fields) >= 0,
all_fields . "|failed",
all_fields . "|success"
)
)
| eval success_fields = mvfilter(match(final_state, "\|success$"))
| eval failed_fields = mvfilter(match(final_state, "\|failed$"))
| eval success_fields = mvmap(success_fields, mvindex(split(success_fields, "|"), 0))
| eval failed_fields = mvmap(failed_fields, mvindex(split(failed_fields, "|"), 0))
| fields - final_state
| eval success_fields=if(isnull(success_fields), "", success_fields), failed_fields=if(isnull(failed_fields), "", failed_fields)
``` calculate total_fields_checked, total_fields_failed, total_fields_passed ```
| eventstats dc(all_fields) as total_fields_checked, dc(failed_fields) as total_fields_failed, dc(success_fields) as total_fields_passed by datamodel, nodename, sourcetype
``` save this as parts of extra attributes ```
| eval extra_attributes = "{" . "\"success_fields\": \"" . mvjoin(success_fields, ",") . "\", \"failed_fields\": \"" . mvjoin(failed_fields, ",") . "\"" . "}"
``` set principal metadata for the flex entity ```
| eval group = datamodel
| eval object = nodename . ":" . sourcetype, alias=sourcetype
| eval object_description = "CIM Quality for DM: " . datamodel . ":" . nodename . ", sourcetype:" . sourcetype
``` gen metrics ```
``` note: if not using a remote target, we could also use the following; it cannot run on a remote target as this command is part of TrackMe ```
``` | trackmegenjsonmetrics fields="percentage_passed,percentage_failed,total_fields_checked,total_fields_failed,total_fields_passed" ```
| eval metrics = "{" .
"\"percentage_passed\": " . if(isnum(percentage_passed), percentage_passed, 0) . ", " .
"\"percentage_failed\": " . if(isnum(percentage_failed), percentage_failed, 0) . ", " .
"\"total_fields_checked\": " . if(isnum(total_fields_checked), total_fields_checked, 0) . ", " .
"\"total_fields_failed\": " . if(isnum(total_fields_failed), total_fields_failed, 0) . ", " .
"\"total_fields_passed\": " . if(isnum(total_fields_passed), total_fields_passed, 0) . "}"
``` no outliers for now ```
| eval outliers_metrics="{}"
``` basic status, thresholds can be defined on a per entity basis ```
| eval status=1
| eval status_description="DM Quality: " . metrics
| eval status_description_short="% passed: " . percentage_passed . ", checked: " . total_fields_checked . ", passed: " . total_fields_passed . ", failed: " . total_fields_failed
```alert if inactive for more than 2 days```
| eval max_sec_inactive=86400*2
Annex: Per field table statistics
The following search example shows the statistics per field:
index=summary sourcetype=trackme:fields_quality
``` you can filter on metadata fields to focus on a specific sourcetype, index, etc. ```
| search metadata.datamodel="Web"
``` stats ```
| fields - summary.* metadata.*
| stats first(*status) as "*status" by event_id
| rename "*.status" as "*"
``` untable ```
| untable event_id, fieldname, value
| stats count, count(eval(value=="failure")) as count_failure, count(eval(value=="success")) as count_success by event_id, fieldname
``` calculate ```
| eval pct_compliance=round(count_success/count*100, 2)
``` aggreg ```
| stats avg(pct_compliance) as avg_pct_compliance by fieldname
| foreach *pct* [ eval <<FIELD>> = round('<<FIELD>>', 2) ]
| rename fieldname as field

Annex: Looking at a specific field
The following search example shows the results for a specific field:
index=summary sourcetype=trackme:fields_quality
``` you can filter on metadata fields to focus on a specific sourcetype, index, etc. ```
| search metadata.datamodel="Web"
| table _time, action.*
