splk-flx - Creating and managing Flex Trackers

Introduction

The splk-flx component (Flex Objects) allows you to turn the results of any Splunk search into fully tracked, monitored, and alerted entities within TrackMe.

  • Flex trackers are scheduled backend jobs that orchestrate entity discovery, KPI tracking, and lifecycle management

  • A Flex tracker can monitor anything you can query in Splunk: modular input statuses, Data Model Acceleration, host availability, infrastructure health, IoT devices, or any custom use case

  • The component ships with a library of 60+ pre-built use case templates organized by vendor and category, ready to deploy in minutes

  • A single tracker can discover and manage anything from a handful of entities to many, depending on your needs

  • Available in the Enterprise Edition and Unlimited Edition

Hint

For a high-level overview of Flex Objects concepts and real-world use cases, see Flex Objects - Adapt TrackMe to any monitoring use case (splk-flx).

The Flex Objects Library

One of the biggest challenges with a component as flexible as Flex Objects is knowing where to start. The Flex Objects Library solves this by providing 60+ pre-built use case templates that you can deploy in minutes.

Each template includes:

  • A complete Splunk search with best-practice field definitions

  • Pre-configured KPIs and metrics

  • Outlier detection rules where applicable

  • Recommended scheduling and time ranges

  • Implementation comments and guidance

The Flex Objects Library with vendor and category filters. Select a use case to view its full definition including the search logic, metrics, and configuration.

Available categories

The library organizes use cases by vendor and category:

| Category | Count | Examples |
|---|---|---|
| Splunk Infrastructure | 17 | KVstore status/size, cluster health, SHC status, CPU/memory usage, queue filling, HEC errors, bundle size, log level variations |
| Data Collection & Quality | 10 | Host tracking, deployment server clients, UF version tracking, dynamic sourcetype detection, volume variations, events drop detection, fields quality |
| License Management | 5 | Pool usage, global enterprise/cloud usage, per-index usage tracking |
| Splunk Cloud | 4 | SVC usage (global, per-app, per-consumer service), storage usage |
| Data Model Acceleration | 1 | DMA completion, size, runtime, and bucket monitoring |
| SOAR (Splunk Automation) | 10 | Asset health, service health, concurrent playbooks, infrastructure load/memory, action/playbook failures, automation brokers, Splunk forwarding |
| Cribl LogStream | 8 | Input/output health, CPU usage, destination pressure, pipeline metrics, route traffic, total traffic |
| Host Monitoring | 4 | Linux/Windows CPU usage, Linux/Windows memory usage (via TA-nix and TA-windows) |

Hint

Loading a use case from the library

When creating a new Flex tracker, select “Vendor” and “Category” to filter the available use cases. Once you select a use case, its full definition is displayed including the search logic, expected metrics, and implementation comments. You can then customize the search to match your environment before deploying.

The Flex Search Contract

At the heart of every Flex tracker is a Splunk search that produces entities. This search must follow a simple contract: each result row represents one entity, and specific field names carry specific meanings.

This is what makes Flex Objects both powerful and approachable - if you can write a Splunk search, you can create a Flex tracker.

The Flex Objects field contract showing required and optional fields with their descriptions and examples.

Required fields

Every Flex search must produce these two fields:

| Field | Type | Description |
|---|---|---|
| object | String | The unique identifier of the entity. Any ASCII character is accepted. This is the primary key that TrackMe uses to track the entity over time. Example: Okta:Production, my_index |
| status | Integer | The health status of the entity: 1 = green (healthy), 2 = orange (warning), 3 = red (critical). If set to 0, the entity state is considered unknown. |
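The two required fields can also be checked outside Splunk, for example on exported search results. The following is a minimal Python sketch, not part of TrackMe: the function name check_row and the sample rows are hypothetical, and it only encodes the rules stated above (a non-empty object and a status of 0, 1, 2, or 3).

```python
from typing import Any, Dict, List

# Status codes as described for the required "status" field.
STATUS_LABELS = {0: "unknown", 1: "green", 2: "orange", 3: "red"}


def check_row(row: Dict[str, Any]) -> List[str]:
    """Return a list of contract problems for one result row (empty if valid)."""
    problems = []
    obj = row.get("object")
    if not isinstance(obj, str) or not obj.strip():
        problems.append("object must be a non-empty string")
    try:
        # Search results often carry numbers as strings, so convert first.
        status = int(row.get("status"))
    except (TypeError, ValueError):
        problems.append("status must be an integer")
    else:
        if status not in STATUS_LABELS:
            problems.append("status must be one of 0, 1, 2, 3")
    return problems


# Hypothetical sample rows: the first is valid, the second breaks both rules.
rows = [
    {"object": "Okta:Production", "status": "1"},
    {"object": "", "status": 5},
]
for row in rows:
    print(row, "->", check_row(row) or "ok")
```

Running a check like this against a sample of results before scheduling a tracker can surface missing or malformed fields early.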

Optional fields

These fields enrich the entity with metadata, KPIs, and anomaly detection:

| Field | Type | Description |
|---|---|---|
| group | String | Logical grouping of entities. Groups can span multiple trackers but overlap should be avoided. If not set, the tracker name is used as the group. |
| alias | String | A human-readable display name for the entity, shown in the UI. |
| object_description | String | Describes the entity for easier management. Example: Datamodel: Authentication, app: Splunk_SA_CIM, retention days: 7.0 |
| status_description | String | Explains the current status condition. Example: acceleration is completed, Input is enabled |
| metrics | JSON String | A JSON object containing one or more Key Performance Indicators (KPIs). Example: {'dma.complete_pct': 100.0, 'dma.size_mb': 136.55} |
| outliers_metrics | JSON String | A JSON object defining Machine Learning Outlier detection rules per metric. Example: {'dma.runduration_sec': {'alert_lower_breached': 0, 'alert_upper_breached': 1}} |
| max_sec_inactive | Integer | Maximum number of seconds an entity can be inactive before turning red. Set to 0 to disable inactivity alerting. |
| default_metric | String | The primary metric to display in the entity table. Example: dma.complete_pct |
| default_threshold | JSON String | A threshold rule applied to the default metric for status determination. |
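The max_sec_inactive rule can be illustrated with a short Python sketch. This is only a reading of the description above, not TrackMe's implementation: last_seen_epoch is a hypothetical input, since how TrackMe records an entity's last activity is not described here.

```python
import time
from typing import Optional


def is_inactive(last_seen_epoch: float, max_sec_inactive: int,
                now: Optional[float] = None) -> bool:
    """True when the entity has been silent longer than max_sec_inactive.

    A value of 0 disables the check, matching the field description.
    """
    if max_sec_inactive == 0:
        return False
    now = time.time() if now is None else now
    return (now - last_seen_epoch) > max_sec_inactive


# An entity last seen 4000 seconds ago with a one-hour limit is inactive.
print(is_inactive(last_seen_epoch=1000, max_sec_inactive=3600, now=5000))
```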

Metrics, thresholds, and ML Outliers configuration - fully UI-driven for effortless setup.

Hint

Fully UI-driven configuration

Starting with TrackMe 2.3, the configuration of metrics, thresholds, and ML Outliers detection is entirely managed through the user interface. While the search contract fields remain available for programmatic definition, you can now configure every aspect of your KPIs and anomaly detection directly from the tracker UI - making it easier than ever to fine-tune your monitoring without editing SPL.

Understanding the metrics JSON

The metrics field is a JSON object where each key is a metric name and each value is a numeric measurement:

| eval metrics = "{'dma.complete_pct': " . complete_pct . ", 'dma.size_mb': " . size_mb . ", 'dma.runduration_sec': " . round(runDuration, 2) . "}"

When metrics are provided:

  • A metric object is automatically created in the Splunk metrics index for each KPI

  • Metric names are prefixed with trackme.splk.flx. in the metrics index

  • You can query these metrics with mstats for dashboards, reports, and further analysis
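To make the naming rule concrete, here is a small Python sketch that reads a metrics string and derives the names under which the KPIs would appear in the metrics index. It is an illustration only: the helper names are hypothetical, and falling back to a Python literal parser for the single-quoted form is an assumption made for this example, not a description of how TrackMe parses the field.

```python
import ast
import json
from typing import Dict

# Prefix applied to Flex metric names in the metrics index.
METRIC_PREFIX = "trackme.splk.flx."


def parse_metrics(raw: str) -> Dict[str, float]:
    """Parse a metrics string written with double or single quotes."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Single-quoted form, as in the eval example: read it as a literal.
        data = ast.literal_eval(raw)
    if not isinstance(data, dict):
        raise ValueError("metrics must be an object of name/value pairs")
    return {str(name): float(value) for name, value in data.items()}


def indexed_names(metrics: Dict[str, float]) -> Dict[str, float]:
    """Prefix each KPI name the way it appears in the metrics index."""
    return {METRIC_PREFIX + name: value for name, value in metrics.items()}


raw = "{'dma.complete_pct': 100.0, 'dma.size_mb': 136.55}"
print(indexed_names(parse_metrics(raw)))
```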

Understanding the outliers_metrics JSON

The outliers_metrics field defines which metrics should be monitored for anomalies using Machine Learning:

| eval outliers_metrics = "{'dma.runduration_sec': {'alert_lower_breached': 0, 'alert_upper_breached': 1, 'time_factor': 'none'}}"

Each metric entry contains:

  • alert_lower_breached: Set to 1 to alert when the metric drops below the lower threshold (e.g., event count drops)

  • alert_upper_breached: Set to 1 to alert when the metric exceeds the upper threshold (e.g., runtime spikes)

  • time_factor (optional): Defines the seasonality model for the ML algorithm:

    • none or omitted: No seasonality, the model uses the full history

    • %H: Hour of day (e.g., “traffic is normally lower at 3 AM”)

    • %H%M: Hour and minute granularity

    • %w%H: Day of week + hour (e.g., “Monday mornings are different from Friday afternoons”)

    • %w%H%M: Day of week + hour + minute for the finest granularity

  • period_calculation (optional): How far back to look when training the ML model. Example: -90d for 90 days of history.
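The time_factor values are strftime-style format strings, so one way to picture them is as a bucket key computed from each data point's timestamp: points that share a key are compared with each other. The Python sketch below shows the key each option produces for a single timestamp. The function name seasonality_bucket and the "all" label are hypothetical, and the bucketing interpretation is an assumption for illustration; %w numbers the days from 0 (Sunday) to 6 (Saturday).

```python
from datetime import datetime
from typing import Optional


def seasonality_bucket(ts: datetime, time_factor: Optional[str] = None) -> str:
    """Return the bucket a timestamp falls into for a given time_factor."""
    if time_factor in (None, "", "none"):
        return "all"  # no seasonality: every point shares one bucket
    return ts.strftime(time_factor)


monday_morning = datetime(2024, 1, 1, 9, 30)  # 2024-01-01 was a Monday
for factor in ("none", "%H", "%H%M", "%w%H", "%w%H%M"):
    print(factor, "->", seasonality_bucket(monday_morning, factor))
```

Finer time factors create more buckets, so each bucket receives fewer data points and generally needs a longer history before it is representative.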

Creating a Flex Tracker

TrackMe provides a guided wizard that walks you through the entire creation process. You can either start from a library template or define a custom search from scratch.

Step 1: Name and deployment

Choose a name (identifier) for your tracker and select the Splunk deployment (local or a configured remote account).

The tracker name is used as a prefix for entity grouping and must be unique within the tenant.

Step 1: Enter the tracker name and select the deployment. The name is validated for uniqueness.

When using a remote deployment, TrackMe automatically validates the connectivity to the remote Splunk instance:

Remote deployment connectivity check showing a successful connection to the remote Splunk instance.

Step 2: Define the search logic

Enter your Splunk search or load one from the library. The search must produce at least the required fields (object and status).

Step 2: Enter or paste your Splunk search logic. The wizard displays the field requirements and examples for reference.

Step 3: Set time ranges

Define the earliest and latest time ranges for the search. Some searches (like those using | rest) don’t rely on time ranges, in which case you can set a very short window.

Step 3: Configure the search time range. Use short ranges like -5m for REST-based searches, or longer ranges for tstats/stats-based searches.

Step 4: Test and review

Before creating the tracker, simulate the search to verify the results. The simulation shows the raw JSON output for each discovered entity, including the parsed metrics and outliers:

Step 4: Simulate the search. Click “Simulate the search” to preview the entities that will be discovered.

The simulation results showing discovered entities with their metrics and outliers_metrics successfully parsed.

Step 5: Schedule and create

Set the cron schedule for the tracker execution. TrackMe automatically randomizes the schedule to distribute load across your environment.

Step 5: Define the cron schedule. A typical 5-minute schedule is given a random offset so that trackers do not all start at the same time.
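The idea of spreading schedules can be sketched in a few lines of Python. This is a hypothetical illustration of schedule randomization in general, not TrackMe's actual logic: the function offset_cron and its approach (turning */N into a range with a random starting minute) are assumptions made for this example.

```python
import random
from typing import Optional


def offset_cron(cron: str, rng: Optional[random.Random] = None) -> str:
    """Shift an '*/N * * * *' schedule by a random minute offset below N."""
    rng = rng or random.Random()
    fields = cron.split()
    if len(fields) != 5 or not fields[0].startswith("*/"):
        return cron  # leave any other schedule untouched
    step_text = fields[0][2:]
    if not step_text.isdigit() or int(step_text) == 0:
        return cron
    step = int(step_text)
    offset = rng.randrange(step)  # 0 .. step-1
    if offset:
        # e.g. "3-59/5" runs at minutes 3, 8, 13, ... instead of 0, 5, 10, ...
        fields[0] = f"{offset}-59/{step}"
    return " ".join(fields)


print(offset_cron("*/5 * * * *"))
```

The interval between runs stays the same; only the starting minute moves, which is what spreads concurrent trackers across the interval.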

Finally, validate and create the tracker:

Promote to Live and create the new Flex tracker.

Success! The new tracker has been created with all its associated knowledge objects (reports, macros, lookups).

KPIs and Metrics

When your Flex search defines the metrics field, TrackMe automatically:

  1. Stores the KPIs as part of the entity record for display in the UI

  2. Ingests metrics into the Splunk metrics index (trackme_metrics) for trending and analysis

  3. Sets a default metric for the entity table view if default_metric is defined

Viewing KPIs in the entity detail

Click on any Flex entity to open its detail view. The Overview entity tab shows the current status, description, and key metrics:

Entity detail view showing a CIM Data Model Acceleration entity with its metrics (dma.complete_pct: 99.30%) and status information.

Querying Flex metrics

All Flex metrics are available in the trackme_metrics index with the prefix trackme.splk.flx.. You can query them with mstats for dashboards and reports:

| mstats avg(trackme.splk.flx.dma.complete_pct) as avg_complete,
    max(trackme.splk.flx.dma.size_mb) as max_size
where index=trackme_metrics tenant_id="my-tenant" object_category="splk-flx"
by object span=1h

Exploring metric events

You can also explore the raw metric events ingested by TrackMe for your Flex entities:

Splunk search showing Flex metric events in the trackme_metrics index with individual metric fields extracted.

Machine Learning Outlier Detection

Flex Objects integrates with TrackMe’s Machine Learning Outlier detection to automatically identify anomalies in your KPIs. This is particularly powerful for use cases where you cannot define a static threshold - like event count variations that follow seasonal patterns.

How it works

When you define outliers_metrics in your Flex search, TrackMe:

  1. Builds an ML model for each metric, learning the normal behavior over time

  2. Calculates dynamic thresholds (upper and lower bounds) based on historical patterns

  3. Detects anomalies when the current value falls outside the expected range

  4. Triggers alerts based on your configuration (lower breach, upper breach, or both)

The time_factor parameter is key to accuracy: it tells the ML model about the seasonality of your data. For example, if your event counts are naturally lower on weekends, using %w%H (day of week + hour) ensures the model doesn’t flag normal weekend dips as anomalies.
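The interplay between dynamic bounds and the two alert switches can be shown with a deliberately simple stand-in. The Python sketch below uses the mean plus or minus three standard deviations as the expected range; this is not TrackMe's actual ML model, and the function names and sample values are hypothetical. It only illustrates how alert_lower_breached and alert_upper_breached decide which side of the range raises an anomaly.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def bounds(history: List[float], k: float = 3.0) -> Tuple[float, float]:
    """Lower and upper bounds as mean -/+ k population standard deviations."""
    centre, spread = mean(history), pstdev(history)
    return centre - k * spread, centre + k * spread


def is_outlier(value: float, history: List[float], rule: Dict[str, int]) -> bool:
    """Apply the alert_lower_breached / alert_upper_breached switches."""
    lower, upper = bounds(history)
    if rule.get("alert_upper_breached") == 1 and value > upper:
        return True
    if rule.get("alert_lower_breached") == 1 and value < lower:
        return True
    return False


# Hypothetical run durations in seconds, with only upper breaches enabled.
history = [40.0, 42.0, 41.0, 39.0, 43.0, 40.0, 41.0]
rule = {"alert_lower_breached": 0, "alert_upper_breached": 1}
print(is_outlier(95.0, history, rule))  # a spike above the upper bound
print(is_outlier(5.0, history, rule))   # a drop, ignored by this rule
```

With a time factor in place, the same comparison would be made against the history of the matching seasonal bucket rather than the full history.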

Viewing outlier detection

The Outliers anomaly detection tab in the entity detail shows the ML model’s predictions alongside actual values:

Outlier detection view showing the ML model’s predicted range (blue band) and actual values over time. The “0 outliers” indicator confirms no anomalies were detected in this window.

The chart displays:

  • The actual metric values as a line chart

  • The predicted normal range as a shaded band (upper and lower bounds)

  • The number of outliers detected in the current time window

  • A time range selector to zoom in or out on the data

You can manage the ML models, select which metric to visualize, and fine-tune the detection parameters directly from this view.

Hint

Outlier detection best practices

  • Start with time_factor: 'none' for metrics without clear seasonality (e.g., DMA completion percentage)

  • Use time_factor: '%w%H' for metrics with weekly patterns (e.g., event counts, search activity)

  • Use time_factor: '%H' for metrics with daily patterns but no weekly variation

  • Allow sufficient training time (at least 7 days, ideally 30+) before relying on outlier alerts

  • Use period_calculation: '-90d' to limit the ML training window and adapt faster to changes

High Scale and Custom Grouping

Flex Objects is designed to handle high-scale environments with thousands of entities. Here are key strategies for managing large deployments effectively.

Custom grouping strategies

The group field in your Flex search determines how entities are organized. Smart grouping helps you:

  • Scope monitoring - Different groups can have different alert thresholds and SLA policies

  • Organize at scale - Group entities by function, environment, tier, or business unit

  • Avoid overlap - Each entity should belong to exactly one group to prevent duplicate tracking

For example, in a Splunk infrastructure monitoring scenario, you might create separate groups for:

| eval group = case(
    tier=="indexers", "infra-indexers",
    tier=="search_heads", "infra-search-heads",
    tier=="heavy_forwarders", "infra-hf",
    1=1, "infra-other"
)

Entity naming conventions

The object field is the unique identifier for each entity. A common best practice is to prefix the object name with the group:

| eval object = group . ":" . host

This produces entity names like infra-indexers:idx01, infra-search-heads:sh01, making it easy to identify entities at a glance.

Performance considerations

  • Optimize your search - Prefer tstats over raw event searches when possible, as it is typically far faster on large datasets

  • Tune the time range - Keep the search window as narrow as possible (e.g., -5m for REST-based searches, -30m to -2h for tstats)

  • Schedule wisely - A 5-minute cron schedule is typical, but less critical use cases can use longer intervals (e.g., */15 * * * *)

  • Use max_sec_inactive - Set appropriate inactivity thresholds to avoid false positives for entities that naturally go quiet

  • Batch processing - TrackMe automatically uses batch KVstore operations and parallel processing for optimal performance at scale

Converging Trackers

While standard Flex trackers (use_case type) create entities from a Splunk search, converging trackers take a different approach: they aggregate the status of multiple existing Flex entities into a single availability metric.

This is particularly useful for:

  • Service-level monitoring - Aggregate the health of all components of a service into a single entity (e.g., “Authentication Service: 98.5% available”)

  • Executive dashboards - Provide a high-level view of complex multi-entity monitoring

  • SLA tracking - Measure overall availability across a group of entities

A converging tracker calculates the percentage of entities in a healthy state within a specified group, providing a single aggregated status and availability metric.
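The aggregation can be pictured with a short Python sketch. It assumes that "healthy" means status 1 (green) and that entities are available as simple records with group and status fields; both are assumptions for illustration, and availability_pct is a hypothetical helper rather than a TrackMe function.

```python
from typing import Dict, List


def availability_pct(entities: List[Dict], group: str) -> float:
    """Percentage of entities in `group` whose status is 1 (green)."""
    members = [e for e in entities if e.get("group") == group]
    if not members:
        raise ValueError(f"no entities found in group {group!r}")
    healthy = sum(1 for e in members if e.get("status") == 1)
    return round(100.0 * healthy / len(members), 2)


# Hypothetical entities: three of the four indexers are green.
entities = [
    {"object": "infra-indexers:idx01", "group": "infra-indexers", "status": 1},
    {"object": "infra-indexers:idx02", "group": "infra-indexers", "status": 1},
    {"object": "infra-indexers:idx03", "group": "infra-indexers", "status": 3},
    {"object": "infra-indexers:idx04", "group": "infra-indexers", "status": 1},
    {"object": "infra-search-heads:sh01", "group": "infra-search-heads", "status": 2},
]
print(availability_pct(entities, "infra-indexers"))
```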

Managing Flex Trackers

The Manage existing flex trackers interface allows you to view, inspect, and delete Flex trackers:

The Manage flex trackers interface showing existing trackers with their knowledge objects (reports, properties, schedule).

From this interface you can:

  • View tracker properties - See the root constraint, cron schedule, time ranges, and associated reports

  • Select trackers for deletion - Use the checkboxes to select one or more trackers for removal

  • Access the executor job - Monitor the TrackMe executor job for tracker operations

Selecting trackers for deletion. Multiple trackers can be selected and removed at once.

Important

TrackMe keeps records of all knowledge objects related to a tracker. You must manage the tracker lifecycle through TrackMe - do not manually delete scheduled reports or lookups outside of the interface.

When a tracker is deleted, its associated knowledge objects are purged. However, entities previously discovered by the tracker are not automatically removed - they will simply no longer be maintained.

Deleting a Flex Tracker through REST

You can also delete a tracker programmatically via the REST API:

| trackme mode=post url="/services/trackme/v2/splk_flx/admin/flx_tracker_delete" body="{'tenant_id': 'mytenant', 'hybrid_trackers_list': 'Okta:prod'}"
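The same call could be prepared from an external script. The Python sketch below builds the POST request with the endpoint path and body keys shown above but does not send it. The host name, the management port 8089, and bearer-token authentication are assumptions for illustration, and build_delete_request is a hypothetical helper; adapt them to your deployment and verify against the TrackMe REST API reference before use.

```python
import json
import urllib.request

ENDPOINT = "/services/trackme/v2/splk_flx/admin/flx_tracker_delete"


def build_delete_request(base_url: str, token: str, tenant_id: str,
                         trackers: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request that deletes Flex trackers."""
    body = {"tenant_id": tenant_id, "hybrid_trackers_list": trackers}
    return urllib.request.Request(
        url=base_url.rstrip("/") + ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and token; nothing is sent over the network here.
req = build_delete_request("https://splunk.example.com:8089", "<token>",
                           "mytenant", "Okta:prod")
print(req.get_method(), req.full_url)
# Sending the request would be: urllib.request.urlopen(req)
```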