splk-flx - Creating and managing Flex Trackers¶
Introduction¶
The splk-flx component (Flex Objects) allows you to turn the results of any Splunk search into fully tracked, monitored, and alerted entities within TrackMe.
Flex trackers are scheduled backend jobs that orchestrate entity discovery, KPI tracking, and lifecycle management
A Flex tracker can monitor anything you can query in Splunk: modular input statuses, Data Model Acceleration, host availability, infrastructure health, IoT devices, or any custom use case
The component ships with a library of 60+ pre-built use case templates organized by vendor and category, ready to deploy in minutes
A single tracker can discover and manage anywhere from a handful of entities to a large number, depending on your needs
Enterprise Edition & Unlimited Edition feature
Hint
For a high-level overview of Flex Objects concepts and real-world use cases, see Flex Objects - Adapt TrackMe to any monitoring use case (splk-flx).
The Flex Objects Library¶
One of the biggest challenges with a component as flexible as Flex Objects is knowing where to start. The Flex Objects Library solves this by providing 60+ pre-built use case templates that you can deploy in minutes.
Each template includes:
A complete Splunk search with best-practice field definitions
Pre-configured KPIs and metrics
Outlier detection rules where applicable
Recommended scheduling and time ranges
Implementation comments and guidance
The Flex Objects Library with vendor and category filters. Select a use case to view its full definition including the search logic, metrics, and configuration.
Available categories¶
The library organizes use cases by vendor and category:
| Category | Count | Examples |
|---|---|---|
| Splunk Infrastructure | 17 | KVstore status/size, cluster health, SHC status, CPU/memory usage, queue filling, HEC errors, bundle size, log level variations |
| Data Collection & Quality | 10 | Host tracking, deployment server clients, UF version tracking, dynamic sourcetype detection, volume variations, events drop detection, fields quality |
| License Management | 5 | Pool usage, global enterprise/cloud usage, per-index usage tracking |
| Splunk Cloud | 4 | SVC usage (global, per-app, per-consumer service), storage usage |
| Data Model Acceleration | 1 | DMA completion, size, runtime, and bucket monitoring |
| SOAR (Splunk Automation) | 10 | Asset health, service health, concurrent playbooks, infrastructure load/memory, action/playbook failures, automation brokers, Splunk forwarding |
| Cribl LogStream | 8 | Input/output health, CPU usage, destination pressure, pipeline metrics, route traffic, total traffic |
| Host Monitoring | 4 | Linux/Windows CPU usage, Linux/Windows memory usage (via TA-nix and TA-windows) |
Hint
Loading a use case from the library
When creating a new Flex tracker, select “Vendor” and “Category” to filter the available use cases. Once you select a use case, its full definition is displayed including the search logic, expected metrics, and implementation comments. You can then customize the search to match your environment before deploying.
The Flex Search Contract¶
At the heart of every Flex tracker is a Splunk search that produces entities. This search must follow a simple contract: each result row represents one entity, and specific field names carry specific meanings.
This is what makes Flex Objects both powerful and approachable - if you can write a Splunk search, you can create a Flex tracker.
The Flex Objects field contract showing required and optional fields with their descriptions and examples.
Required fields¶
Every Flex search must produce these two fields:
| Field | Type | Description |
|---|---|---|
| object | String | The unique identifier of the entity. Any ASCII character is accepted. This is the primary key that TrackMe uses to track the entity over time. Example: |
| status | Integer | The health status of the entity: 1 = green (healthy), 2 = orange (warning), 3 = red (critical). If set to 0, the entity state is considered unknown. |
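As a minimal sketch of the contract, the search below generates three synthetic rows with makeresults and emits only the two required fields. The entity names (demo:entity-1 and so on) and the status assignment are purely illustrative assumptions, not taken from a library template; a real tracker would derive both fields from actual data.

```spl
| makeresults count=3
| streamstats count as n
| eval object = "demo:entity-" . n
| eval status = case(n==1, 1, n==2, 2, true(), 3)
| table object status
```

Each row is one entity: the first would be reported green, the second orange, and the third red. Running a throwaway search like this in the wizard's simulation step can be a quick way to check how results are interpreted before wiring in the real logic.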
Optional fields¶
These fields enrich the entity with metadata, KPIs, and anomaly detection:
| Field | Type | Description |
|---|---|---|
| group | String | Logical grouping of entities. Groups can span multiple trackers but overlap should be avoided. If not set, the tracker name is used as the group. |
|  | String | A human-readable display name for the entity, shown in the UI. |
|  | String | Describes the entity for easier management. Example: |
|  | String | Explains the current status condition. Example: |
| metrics | JSON String | A JSON object containing one or more Key Performance Indicators (KPIs). Example: |
| outliers_metrics | JSON String | A JSON object defining Machine Learning Outlier detection rules per metric. Example: |
| max_sec_inactive | Integer | Maximum number of seconds an entity can be inactive before turning red. Set to |
| default_metric | String | The primary metric to display in the entity table. Example: |
|  | JSON String | A threshold rule applied to the default metric for status determination. |
Metrics, thresholds, and ML Outliers configuration - fully UI-driven for effortless setup.
Hint
Fully UI-driven configuration
Starting with TrackMe 2.3, the configuration of metrics, thresholds, and ML Outliers detection is entirely managed through the user interface. While the search contract fields remain available for programmatic definition, you can now configure every aspect of your KPIs and anomaly detection directly from the tracker UI - making it easier than ever to fine-tune your monitoring without editing SPL.
Understanding the metrics JSON¶
The metrics field is a JSON object where each key is a metric name and each value is a numeric measurement:
| eval metrics = "{'dma.complete_pct': " . complete_pct . ", 'dma.size_mb': " . size_mb . ", 'dma.runduration_sec': " . round(runDuration, 2) . "}"
When metrics are provided:
A metric object is automatically created in the Splunk metrics index for each KPI
Metric names are prefixed with trackme.splk.flx. in the metrics index
You can query these metrics with mstats for dashboards, reports, and further analysis
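To see the metrics string in context, here is a hedged, self-contained sketch that fakes the three DMA measurements with makeresults and then assembles the metrics field exactly as in the eval shown earlier. The entity name and the numeric values are made up for illustration; in a real tracker they would come from the search that inspects the data model.

```spl
| makeresults
| eval object = "demo:dma-example", status = 1
| eval complete_pct = 99.3, size_mb = 512, runDuration = 12.3456
| eval metrics = "{'dma.complete_pct': " . complete_pct . ", 'dma.size_mb': " . size_mb . ", 'dma.runduration_sec': " . round(runDuration, 2) . "}"
| table object status metrics
```

The resulting metrics value is a single string holding one key per KPI. Building it by concatenation keeps the search simple, at the cost of having to make sure every concatenated field is non-null, since a null operand would make the whole expression null.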
Understanding the outliers_metrics JSON¶
The outliers_metrics field defines which metrics should be monitored for anomalies using Machine Learning:
| eval outliers_metrics = "{'dma.runduration_sec': {'alert_lower_breached': 0, 'alert_upper_breached': 1, 'time_factor': 'none'}}"
Each metric entry contains:
alert_lower_breached: Set to 1 to alert when the metric drops below the lower threshold (e.g., event count drops)
alert_upper_breached: Set to 1 to alert when the metric exceeds the upper threshold (e.g., runtime spikes)
time_factor (optional): Defines the seasonality model for the ML algorithm:
    none or omitted: No seasonality, the model uses the full history
    %H: Hour of day (e.g., “traffic is normally lower at 3 AM”)
    %H%M: Hour and minute granularity
    %w%H: Day of week + hour (e.g., “Monday mornings are different from Friday afternoons”)
    %w%H%M: Day of week + hour + minute for the finest granularity
period_calculation (optional): How far back to look for the ML model training. Example: -90d for 90 days of history.
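The options above can be combined in a single entry. The sketch below assumes a hypothetical events.count metric with a weekly pattern: it alerts only on drops, uses the day-of-week plus hour seasonality, and limits training to 90 days. The entity name, metric name, and value are illustrative assumptions.

```spl
| makeresults
| eval object = "demo:events-feed", status = 1
| eval event_count = 1250
| eval metrics = "{'events.count': " . event_count . "}"
| eval outliers_metrics = "{'events.count': {'alert_lower_breached': 1, 'alert_upper_breached': 0, 'time_factor': '%w%H', 'period_calculation': '-90d'}}"
| table object status metrics outliers_metrics
```

Note that the key in outliers_metrics matches the key in metrics, since the outlier rule refers to a KPI the same search produces.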
Creating a Flex Tracker¶
TrackMe provides a guided wizard that walks you through the entire creation process. You can either start from a library template or define a custom search from scratch.
Step 1: Name and deployment¶
Choose a name for your tracker identifier and select the Splunk deployment (local or a configured remote account).
The tracker name is used as a prefix for entity grouping and must be unique within the tenant.
Step 1: Enter the tracker name and select the deployment. The name is validated for uniqueness.
When using a remote deployment, TrackMe automatically validates the connectivity to the remote Splunk instance:
Remote deployment connectivity check showing a successful connection to the remote Splunk instance.
Step 2: Define the search logic¶
Enter your Splunk search or load one from the library. The search must produce at least the required fields (object and status).
Step 2: Enter or paste your Splunk search logic. The wizard displays the field requirements and examples for reference.
Step 3: Set time ranges¶
Define the earliest and latest time ranges for the search. Some searches (like those using | rest) don’t rely on time ranges, in which case you can set a very short window.
Step 3: Configure the search time range. Use short ranges like -5m for REST-based searches, or longer ranges for tstats/stats-based searches.
Step 4: Test and review¶
Before creating the tracker, simulate the search to verify the results. The simulation shows the raw JSON output for each discovered entity, including the parsed metrics and outliers:
Step 4: Simulate the search. Click “Simulate the search” to preview the entities that will be discovered.
The simulation results showing discovered entities with their metrics and outliers_metrics successfully parsed.
Step 5: Schedule and create¶
Set the cron schedule for the tracker execution. TrackMe automatically randomizes the schedule to distribute load across your environment.
Step 5: Define the cron schedule. A typical every-5-minutes schedule is automatically randomized to distribute load.
Finally, validate and create the tracker:
Promote to Live and create the new Flex tracker.
Success! The new tracker has been created with all its associated knowledge objects (reports, macros, lookups).
KPIs and Metrics¶
When your Flex search defines the metrics field, TrackMe automatically:
Stores the KPIs as part of the entity record for display in the UI
Ingests metrics into the Splunk metrics index (trackme_metrics) for trending and analysis
Sets a default metric for the entity table view if default_metric is defined
Viewing KPIs in the entity detail¶
Click on any Flex entity to open its detail view. The Overview entity tab shows the current status, description, and key metrics:
Entity detail view showing a CIM Data Model Acceleration entity with its metrics (dma.complete_pct: 99.30%) and status information.
Querying Flex metrics¶
All Flex metrics are available in the trackme_metrics index with the prefix trackme.splk.flx.. You can query them with mstats for dashboards and reports:
| mstats avg(trackme.splk.flx.dma.complete_pct) as avg_complete,
max(trackme.splk.flx.dma.size_mb) as max_size
where index=trackme_metrics tenant_id="my-tenant" object_category="splk-flx"
by object span=1h
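If you do not yet know which metric names a tracker has produced, a variation on the query above can list them. This is a sketch using the generic mstats form with a metric_name wildcard; it reuses the tenant_id and object_category dimensions from the previous example, and "my-tenant" remains a placeholder.

```spl
| mstats latest(_value) as value
    where index=trackme_metrics metric_name="trackme.splk.flx.*" tenant_id="my-tenant" object_category="splk-flx"
    by metric_name object
| sort object metric_name
```

This returns one row per entity and metric with its most recent value in the selected time range, which is handy for confirming that the names in your metrics JSON arrived as expected.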
Exploring metric events¶
You can also explore the raw metric events ingested by TrackMe for your Flex entities:
Splunk search showing Flex metric events in the trackme_metrics index with individual metric fields extracted.
Machine Learning Outlier Detection¶
Flex Objects integrates with TrackMe’s Machine Learning Outlier detection to automatically identify anomalies in your KPIs. This is particularly powerful for use cases where you cannot define a static threshold - like event count variations that follow seasonal patterns.
How it works¶
When you define outliers_metrics in your Flex search, TrackMe:
Builds an ML model for each metric, learning the normal behavior over time
Calculates dynamic thresholds (upper and lower bounds) based on historical patterns
Detects anomalies when the current value falls outside the expected range
Triggers alerts based on your configuration (lower breach, upper breach, or both)
The time_factor parameter is key to accuracy: it tells the ML model about the seasonality of your data. For example, if your event counts are naturally lower on weekends, using %w%H (day of week + hour) ensures the model doesn’t flag normal weekend dips as anomalies.
Viewing outlier detection¶
The Outliers anomaly detection tab in the entity detail shows the ML model’s predictions alongside actual values:
Outlier detection view showing the ML model’s predicted range (blue band) and actual values over time. The “0 outliers” indicator confirms no anomalies were detected in this window.
The chart displays:
The actual metric values as a line chart
The predicted normal range as a shaded band (upper and lower bounds)
The number of outliers detected in the current time window
A time range selector to zoom in or out on the data
You can manage the ML models, select which metric to visualize, and fine-tune the detection parameters directly from this view.
Hint
Outlier detection best practices
Start with time_factor: 'none' for metrics without clear seasonality (e.g., DMA completion percentage)
Use time_factor: '%w%H' for metrics with weekly patterns (e.g., event counts, search activity)
Use time_factor: '%H' for metrics with daily patterns but no weekly variation
Allow sufficient training time (at least 7 days, ideally 30+) before relying on outlier alerts
Use period_calculation: '-90d' to limit the ML training window and adapt faster to changes
High Scale and Custom Grouping¶
Flex Objects is designed to handle high-scale environments with thousands of entities. Here are key strategies for managing large deployments effectively.
Custom grouping strategies¶
The group field in your Flex search determines how entities are organized. Smart grouping helps you:
Scope monitoring - Different groups can have different alert thresholds and SLA policies
Organize at scale - Group entities by function, environment, tier, or business unit
Avoid overlap - Each entity should belong to exactly one group to prevent duplicate tracking
For example, in a Splunk infrastructure monitoring scenario, you might create separate groups for:
| eval group = case(
tier=="indexers", "infra-indexers",
tier=="search_heads", "infra-search-heads",
tier=="heavy_forwarders", "infra-hf",
1=1, "infra-other"
)
Entity naming conventions¶
The object field is the unique identifier for each entity. A common best practice is to prefix the object name with the group:
| eval object = group . ":" . host
This produces entity names like infra-indexers:idx01, infra-search-heads:sh01, making it easy to identify entities at a glance.
Performance considerations¶
Optimize your search - Use tstats over raw searches when possible, as it’s orders of magnitude faster for large datasets
Tune the time range - Keep the search window as narrow as possible (e.g., -5m for REST-based searches, -30m to -2h for tstats)
Schedule wisely - A 5-minute cron schedule is typical, but less critical use cases can use longer intervals (e.g., */15 * * * *)
Use max_sec_inactive - Set appropriate inactivity thresholds to avoid false positives for entities that naturally go quiet
Batch processing - TrackMe automatically uses batch KVstore operations and parallel processing for optimal performance at scale
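Several of these recommendations can be seen together in one search. The following is a hedged sketch, not a library template: it uses tstats against the _internal index, derives a status from how recently each host was seen, applies the group-prefixed naming convention, and sets max_sec_inactive. The group name, the 300 and 900 second cut-offs, and the one-hour inactivity limit are arbitrary assumptions chosen for illustration, and the time range is expected to be set in the wizard rather than in the search.

```spl
| tstats count as events, max(_time) as last_seen where index=_internal by host
| eval lag_sec = now() - last_seen
| eval group = "infra-hosts"
| eval object = group . ":" . host
| eval status = case(lag_sec < 300, 1, lag_sec < 900, 2, true(), 3)
| eval metrics = "{'events.count': " . events . ", 'lag_sec': " . lag_sec . "}"
| eval max_sec_inactive = 3600
| table group object status metrics max_sec_inactive
```

Because tstats reads indexed metadata rather than raw events, a search of this shape tends to stay fast even when the number of hosts grows.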
Converging Trackers¶
While standard Flex trackers (use_case type) create entities from a Splunk search, converging trackers take a different approach: they aggregate the status of multiple existing Flex entities into a single availability metric.
This is particularly useful for:
Service-level monitoring - Aggregate the health of all components of a service into a single entity (e.g., “Authentication Service: 98.5% available”)
Executive dashboards - Provide a high-level view of complex multi-entity monitoring
SLA tracking - Measure overall availability across a group of entities
A converging tracker calculates the percentage of entities in a healthy state within a specified group, providing a single aggregated status and availability metric.
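The calculation can be illustrated with plain SPL. The sketch below is only an illustration of the arithmetic, not TrackMe's actual implementation: it fabricates four entities in a hypothetical demo-service group, marks one of them red, and computes the share of healthy entities.

```spl
| makeresults count=4
| streamstats count as n
| eval group = "demo-service"
| eval object = group . ":component-" . n
| eval status = if(n==4, 3, 1)
| stats count as total_entities, count(eval(status==1)) as healthy_entities by group
| eval availability_pct = round(100 * healthy_entities / total_entities, 2)
```

Here three of the four entities are healthy, so the group's availability works out to 75 percent.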
Managing Flex Trackers¶
The Manage existing flex trackers interface allows you to view, inspect, and delete Flex trackers:
The Manage flex trackers interface showing existing trackers with their knowledge objects (reports, properties, schedule).
From this interface you can:
View tracker properties - See the root constraint, cron schedule, time ranges, and associated reports
Select trackers for deletion - Use the checkboxes to select one or more trackers for removal
Access the executor job - Monitor the TrackMe executor job for tracker operations
Selecting trackers for deletion. Multiple trackers can be selected and removed at once.
Important
TrackMe keeps records of all knowledge objects related to a tracker. You must manage the tracker lifecycle through TrackMe - do not manually delete scheduled reports or lookups outside of the interface.
When a tracker is deleted, its associated knowledge objects are purged. However, entities previously discovered by the tracker are not automatically removed - they will simply no longer be maintained.
Deleting a Flex Tracker through REST¶
You can also delete a tracker programmatically via the REST API:
| trackme mode=post url="/services/trackme/v2/splk_flx/admin/flx_tracker_delete" body="{'tenant_id': 'mytenant', 'hybrid_trackers_list': 'Okta:prod'}"