Automatic Training Pipelines
Infrastructure drifts: new items get added, workloads change, seasonal patterns shift. A model trained a month ago learned a baseline that may no longer match your current environment, and it will slowly accumulate false positives as legitimate new behavior diverges from the baseline it remembers. To keep detections accurate, enable Automatic retraining on the model configuration (see the Models page).
What triggers a retrain
Automatic retraining is driven by the linked dataset configuration, not by a fixed time schedule. Whenever that dataset configuration produces a new dataset, every model configuration linked to it with automatic retraining enabled starts a fresh training run against the new data.
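As a mental model, the selection step can be pictured as the sketch below. It only restates the rule above in code; the class and field names are illustrative assumptions, not the platform's real schema or API.

```python
from dataclasses import dataclass

# Illustrative stand-in only; field names are assumptions, not the real schema.
@dataclass
class ModelConfig:
    name: str
    dataset_config_id: str
    automatic_retraining: bool

def configs_to_retrain(dataset_config_id: str,
                       model_configs: list[ModelConfig]) -> list[ModelConfig]:
    """When a dataset configuration produces a new dataset, every model
    configuration linked to it with automatic retraining enabled starts
    a training run against the new data."""
    return [
        mc for mc in model_configs
        if mc.dataset_config_id == dataset_config_id and mc.automatic_retraining
    ]
```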
In practice, this means the retraining cadence is controlled by the dataset configuration’s data extraction schedule. Shorter intervals produce more frequent retrains and consume more GPU/CPU time, while longer intervals retrain less often at the cost of letting the model age. Monthly refreshes are a sensible default for most environments.
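For example, a dataset configuration that extracts data monthly gives every linked model configuration roughly one retrain per month. The snippet below is a hypothetical configuration shape, shown only to make that coupling concrete; the field names are not the product’s real schema.

```python
# Hypothetical dataset configuration shape; field names are illustrative only.
dataset_config = {
    "name": "prod-metrics-history",
    "extraction_schedule": "monthly",  # also sets the retrain cadence for linked models
    # "weekly" would retrain ~4x as often and use correspondingly more GPU/CPU time;
    # "quarterly" would let the deployed model age for months between refreshes.
}
```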
Automatic redeployment
A retrained model does not replace the old one in any active deployment by default. Each deployment points at a specific trained model, so a new training run creates a new model under the same configuration while leaving the live deployment untouched. To make retraining end-to-end automatic, enable the Automatic redeploy toggle on the deployment (see Deployments). When both flags are set, dataset regeneration triggers a new training run, and once training finishes, the deployment swaps to the new model without any manual intervention.
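Put together, one pass of the loop behaves roughly like the minimal sketch below, assuming simplified stand-in objects rather than the platform’s real internals. The point it illustrates is that training always produces a new model, and the live deployment only changes when Automatic redeploy is enabled.

```python
from dataclasses import dataclass, field

# Simplified stand-ins for illustration; not the platform's real objects.
@dataclass
class ModelConfig:
    automatic_retraining: bool
    trained_models: list[str] = field(default_factory=list)

@dataclass
class Deployment:
    model_id: str
    automatic_redeploy: bool

def on_dataset_regenerated(new_dataset_id: str,
                           model_config: ModelConfig,
                           deployment: Deployment) -> None:
    """One pass of the dataset-regenerate -> retrain -> redeploy loop."""
    if not model_config.automatic_retraining:
        return
    # Training produces a new model under the same configuration;
    # previously trained models stay available.
    new_model_id = f"model-from-{new_dataset_id}"
    model_config.trained_models.append(new_model_id)
    # The live deployment only changes when Automatic redeploy is enabled.
    if deployment.automatic_redeploy:
        deployment.model_id = new_model_id
```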
If Automatic redeploy is left off, newly trained models remain available for you to adopt manually from the deployment details page whenever it makes sense to do so.
When the fully-automatic loop is a good default
The dataset-regenerate, retrain, and redeploy loop is a good default for Iris Pro and Iris Ultra deployments. Their ensembles keep run-to-run variance very low, so successive retrains alert on almost the same patterns and alerting behavior stays consistent across the loop.
For Iris Nano and Iris Core, the same loop is riskier. Their higher run-to-run variance means each retrain can noticeably change what the model considers normal, so alert behavior can drift over time. For production deployments on Nano or Core where alert stability matters, consider leaving Automatic redeploy off so each retrained model can be manually reviewed with one or more AI model tests, or move up to Iris Pro or Iris Ultra.
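A lightweight way to structure that manual review is to gate adoption on the outcome of the model tests, roughly as in the sketch below; the function and test names are illustrative assumptions, not a documented interface.

```python
def should_adopt(test_results: dict[str, bool]) -> bool:
    """Adopt the retrained Nano/Core model only if every AI model test passed."""
    return bool(test_results) and all(test_results.values())

# Hypothetical test names and outcomes for one retrained model:
results = {"alert-volume-comparison": True, "known-incident-replay": False}
if should_adopt(results):
    print("Adopt the new model from the deployment details page.")
else:
    print("Keep the current model deployed and investigate the failing test.")
```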
When manual retraining is needed
Even with an automatic training pipeline enabled, manual retraining is sometimes required. Automatic retraining only fires on the dataset configuration’s data extraction schedule, so a significant change to the monitored system can sit unaddressed until the next scheduled retrain. While it waits, the deployed model keeps comparing new behavior to its old baseline and accumulates false positives.
Plan a manual retrain after events like the following (a sketch of kicking one off programmatically follows the list):
- A new monitoring template being rolled out across the fleet, introducing items the current model has never seen.
- A major infrastructure change, such as a database major version upgrade.
- A permanent scaling change to a monitored service that shifts its baseline traffic characteristics.
- Onboarding a large new customer or otherwise introducing a sustained change in load patterns.
- A scheduled job being moved to a different time slot, so the model no longer expects activity at the old times.
- A maintenance window being redefined, so that the new planned quiet periods are not flagged as anomalies.
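If your installation exposes an API for starting training runs, a manual retrain after one of these events might be kicked off roughly as in the sketch below. The base URL, endpoint path, and field names are assumptions for illustration only, not the documented interface.

```python
import requests  # hypothetical REST sketch; every name below is an assumption

BASE_URL = "https://iris.example.com/api/v1"  # placeholder, not a real endpoint

def trigger_manual_retrain(model_config_id: str, api_token: str) -> str:
    """Start a training run for a model configuration outside the automatic schedule."""
    resp = requests.post(
        f"{BASE_URL}/model-configurations/{model_config_id}/training-runs",
        headers={"Authorization": f"Bearer {api_token}"},
        timeout=30,
    )
    resp.raise_for_status()
    # Review the resulting model (or let Automatic redeploy pick it up) as usual.
    return resp.json()["training_run_id"]
```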