Argo Workflows or Airflow / Prefect?

Argo if you're already on K8s and want each step to run in its own container. Airflow or Prefect for Python-native pipelines. Dagster for asset-centric pipelines. Each fits a different workflow style.

Is Argo Workflows still maintained?

Yes, it is actively developed in 2026. The Argo project (Workflows, CD, Events, Rollouts) is a CNCF graduated project, and one of the larger ones, with a strong community.

Argo Workflows in 2026 — Pipelines on Kubernetes That Actually Work

Argo Workflows runs DAG and step-based pipelines as Kubernetes custom resources (defined by CRDs). Each step is a container. By 2026 it’s the de-facto choice for K8s-shop pipelines and ML training jobs. This post is the working set.

Why Argo

Each step is a container — language-agnostic; reproducible.
K8s-native — uses pods; integrates with autoscaler, GPUs, etc.
DAG: parallelism, dependencies, conditionals.
Artifacts: shared between steps via S3 / GCS / Minio.
UI for monitoring runs.

Hello world

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        image: alpine:3.20
        command: [sh, -c]
        args: ["echo Hello from Argo"]

argo submit -n argo workflow.yaml
argo watch -n argo @latest

Pod runs; logs streamed. Fundamentals.

DAG

spec:
  entrypoint: pipeline
  templates:
    - name: pipeline
      dag:
        tasks:
          - name: extract
            template: extract-step
          - name: transform-a
            template: transform
            arguments: { parameters: [{ name: input, value: "a" }] }
            dependencies: [extract]
          - name: transform-b
            template: transform
            arguments: { parameters: [{ name: input, value: "b" }] }
            dependencies: [extract]
          - name: load
            template: load-step
            dependencies: [transform-a, transform-b]

Parallel transforms; serial load. Argo schedules based on the graph.
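The DAG above points at three templates (extract-step, transform, load-step) that the snippet leaves out. A minimal sketch of what they could look like; the images and commands are placeholders I made up for illustration, not part of the original pipeline:

```yaml
# These entries sit in the same spec.templates list as the `pipeline` DAG.
# Images and commands below are hypothetical stand-ins.
- name: extract-step
  container:
    image: alpine:3.20
    command: [sh, -c]
    args: ["echo extracting"]

- name: transform
  inputs:
    parameters:
      - name: input              # filled by each DAG task's `arguments`
  container:
    image: alpine:3.20
    command: [sh, -c]
    args: ["echo transforming {{inputs.parameters.input}}"]

- name: load-step
  container:
    image: alpine:3.20
    command: [sh, -c]
    args: ["echo loading"]
```

Note that transform is defined once and used by two tasks; the parameter is what makes transform-a and transform-b differ.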

Parameters

spec:
  entrypoint: main
  arguments:
    parameters:
      - name: dataset
        value: "users"
  templates:
    - name: main
      inputs:
        parameters:
          - name: dataset
      container:
        image: my/etl:1.0
        command: [python, run.py]
        args: ["--dataset", "{{inputs.parameters.dataset}}"]

Inputs flow as args / env vars. Override at submit:

argo submit -p dataset=orders workflow.yaml
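The example above passes the parameter as a CLI arg. The env var route looks roughly like this; the variable name DATASET is an assumption, use whatever run.py actually reads:

```yaml
- name: main
  inputs:
    parameters:
      - name: dataset
  container:
    image: my/etl:1.0
    command: [python, run.py]
    env:
      - name: DATASET            # hypothetical variable name
        value: "{{inputs.parameters.dataset}}"
```

Argo substitutes the template tag before the pod is created, so the container just sees a plain environment variable.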

Artifacts

- name: extract
  outputs:
    artifacts:
      - name: data
        path: /tmp/data.json
        s3:
          bucket: my-pipelines
          key: extract-output.json
  container:
    image: my/extract:1.0
    # writes to /tmp/data.json

- name: transform
  inputs:
    artifacts:
      - name: input
        path: /tmp/input.json
  container: ...

Argo handles upload/download to/from S3. Steps share data without sharing volumes.
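The two templates above declare an output and an input, but something still has to connect them. In a DAG that is done in the task's arguments. A sketch, assuming a DAG template named pipeline (the name is hypothetical) sitting in the same templates list:

```yaml
- name: pipeline
  dag:
    tasks:
      - name: extract
        template: extract
      - name: transform
        template: transform
        dependencies: [extract]
        arguments:
          artifacts:
            - name: input        # matches the input artifact name on `transform`
              from: "{{tasks.extract.outputs.artifacts.data}}"
```

The from reference is what tells Argo to download extract's data artifact to /tmp/input.json before transform starts.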

Retries and timeouts

- name: flaky-step
  retryStrategy:
    limit: 3
    retryPolicy: OnError
    backoff:
      duration: 1m
      factor: 2
      maxDuration: 30m
  activeDeadlineSeconds: 3600
  container: ...

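Retry with exponential backoff, plus a hard timeout per step. Note that OnError only retries steps that error out (for example, the pod was deleted or the node went away). To also retry a non-zero exit code, use OnFailure (the default) or Always.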

Conditional execution

- name: maybe-deploy
  when: "{{steps.test.outputs.parameters.passed}} == true"
  template: deploy-step

Skip steps based on previous outputs.
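For that when expression to resolve, the test step has to publish a passed output parameter. A minimal sketch of the surrounding pieces; test-step, the ci template name, and the placeholder test command are assumptions for illustration:

```yaml
- name: test-step
  container:
    image: alpine:3.20
    command: [sh, -c]
    # Placeholder for a real test run: writes "true" or "false" to a file.
    args: ["printf true > /tmp/passed"]
  outputs:
    parameters:
      - name: passed
        valueFrom:
          path: /tmp/passed      # file contents become the parameter value

- name: ci
  steps:
    - - name: test
        template: test-step
    - - name: maybe-deploy
        template: deploy-step
        when: "{{steps.test.outputs.parameters.passed}} == true"
```

If the expression evaluates to false, maybe-deploy is marked as skipped and the workflow still succeeds.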

Resources / GPU

container:
  image: my/training:1.0
  resources:
    requests: { cpu: "8", memory: "32Gi", nvidia.com/gpu: "1" }
    limits: { cpu: "16", memory: "64Gi", nvidia.com/gpu: "1" }
# nodeSelector is a template-level field, a sibling of container
nodeSelector:
  accelerator: nvidia-h100

Per-step GPU request. Kubernetes schedules the pod onto a matching node. Combined with Karpenter / cluster autoscaler: GPU nodes spin up only when needed.

For ML training pipelines: this is the killer combo.
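GPU node pools are often tainted so ordinary pods stay off them, in which case the training step also needs a toleration. A sketch of a full template showing where each field lives; the template name train and the taint key are assumptions, so match them to your own node pool:

```yaml
- name: train
  # Scheduling fields live on the template, next to `container`.
  nodeSelector:
    accelerator: nvidia-h100
  tolerations:
    - key: nvidia.com/gpu        # assumed taint key on the GPU nodes
      operator: Exists
      effect: NoSchedule
  container:
    image: my/training:1.0
    resources:
      requests:
        cpu: "8"
        memory: 32Gi
        nvidia.com/gpu: "1"
      limits:
        cpu: "16"
        memory: 64Gi
        nvidia.com/gpu: "1"
```

The GPU count has to be the same in requests and limits; Kubernetes does not allow overcommitting extended resources.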

Loops / fanout

- name: process-list
  steps:
    - - name: process
        template: process-item
        arguments: { parameters: [{ name: item, value: "{{item}}" }] }
        withItems:
          - alpha
          - beta
          - gamma

Generates one pod per item. For a dynamic list taken from a previous step’s output, use withParam and point it at that step’s JSON array output.
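A sketch of the dynamic variant, assuming a hypothetical list-items script template that prints a JSON array to stdout and the same process-item template as above:

```yaml
- name: list-items
  script:
    image: python:3.12-alpine
    command: [python]
    source: |
      import json
      # stdout becomes outputs.result; it must be a JSON list
      print(json.dumps(["alpha", "beta", "gamma"]))

- name: process-dynamic
  steps:
    - - name: list
        template: list-items
    - - name: process
        template: process-item
        arguments:
          parameters:
            - name: item
              value: "{{item}}"
        withParam: "{{steps.list.outputs.result}}"
```

In practice the list step would query a bucket or a database rather than print a fixed array; the fan-out works the same way.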

CronWorkflow

apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata: { name: nightly-etl }
spec:
  schedule: "0 2 * * *"
  workflowSpec: {...}

Replaces Kubernetes CronJobs for pipeline scheduling.
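A slightly fuller sketch with the scheduling knobs I would usually set. The workflowSpec body here is a placeholder I filled in for illustration, and the specific values are assumptions:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: nightly-etl
spec:
  schedule: "0 2 * * *"
  timezone: "Etc/UTC"              # IANA zone name; pick your own
  concurrencyPolicy: Forbid        # skip a run if the previous one is still going
  startingDeadlineSeconds: 300     # still start a missed run up to 5 minutes late
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  workflowSpec:
    entrypoint: main
    templates:
      - name: main
        container:
          image: my/etl:1.0        # placeholder image
          command: [python, run.py]
```

Newer Argo releases also have a plural schedules list; check the field reference for the version you run before copying this.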

Argo Events

For event-triggered workflows (S3 upload, Kafka message, webhook):

# EventSource → Sensor → triggers Workflow

Reactive pipelines. Less common pattern but powerful.
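A rough sketch of that chain for a webhook, assuming Argo Events is installed with an EventBus in the namespace. The resource names, the port, the dataset field in the request body, and the operate-workflow-sa service account are all assumptions for illustration:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: webhook
spec:
  webhook:
    ingest:                        # event name, referenced by the Sensor
      port: "12000"
      endpoint: /ingest
      method: POST
---
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: webhook
spec:
  template:
    serviceAccountName: operate-workflow-sa   # needs RBAC to create Workflows
  dependencies:
    - name: payload
      eventSourceName: webhook
      eventName: ingest
  triggers:
    - template:
        name: run-etl
        k8s:
          operation: create
          source:
            resource:
              apiVersion: argoproj.io/v1alpha1
              kind: Workflow
              metadata:
                generateName: etl-
              spec:
                entrypoint: main
                arguments:
                  parameters:
                    - name: dataset
                      value: placeholder      # overwritten from the event
                templates:
                  - name: main
                    container:
                      image: my/etl:1.0
                      command: [python, run.py]
                      args: ["--dataset", "{{workflow.parameters.dataset}}"]
          parameters:
            - src:
                dependencyName: payload
                dataKey: body.dataset         # JSON field in the POST body
              dest: spec.arguments.parameters.0.value
```

Treat this as a starting point and validate it against the Argo Events version you deploy; the trigger spec has changed across releases.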

When Argo wins

K8s-shop: already running K8s for apps.
Polyglot pipelines: each step a different language.
GPU / ML: native K8s GPU integration.
Heavy parallelism: K8s scales pods.
CI for ML: train + eval + deploy as DAG.

When it doesn’t

Python-native simple workflows: Prefect / Dagster simpler DX.
Heavy data orchestration with metadata: Dagster.
Existing Airflow shop: don’t migrate just because.
Non-K8s deployment: Argo needs K8s.

Argo vs alternatives

Tool	Strengths	Weaknesses
Argo Workflows	K8s-native; container-per-step; GPU	YAML; learning curve
Airflow	Mature; huge plugin ecosystem	Python-centric; ops-heavy
Prefect	Python DX; cloud option	Less K8s-native
Dagster	Asset-centric; data lineage	Different mental model
Temporal	Durable execution; code-first	More for app workflows than data

For ML / DevOps pipelines on K8s: Argo. For data engineering: Dagster or Airflow. For app workflows: Temporal.

See Temporal Workflow Engine.

Operational realities

Workflow controller handles DAG; pods come and go.
Live workflow state is stored on the Workflow object itself (in etcd). PostgreSQL / MySQL is optional, used for the workflow archive and for offloading very large workflows.
Logs go to standard pod logs; persist via your usual log pipeline.
History: archived workflows in DB; clean up periodically.
Resource limits: a workflow that fans out to thousands of pods can overwhelm the controller and the cluster. Set concurrency limits (parallelism).
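The database and concurrency points above mostly come down to the controller ConfigMap. A sketch, assuming the default ConfigMap name in the argo namespace; the host, database, Secret names, and numbers are placeholders:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
  namespace: argo
data:
  # Cap on how many workflows the controller runs at once.
  parallelism: "50"
  persistence: |
    archive: true                # write completed workflows to the database
    archiveTTL: 30d              # drop archived rows after 30 days
    postgresql:
      host: postgres.argo.svc    # hypothetical service name
      port: 5432
      database: argo
      tableName: argo_workflows
      userNameSecret:
        name: argo-postgres-config
        key: username
      passwordSecret:
        name: argo-postgres-config
        key: password
```

A single workflow can also limit its own fan-out by setting parallelism in its spec.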

CI / CD with Argo Workflows

# On PR
- run: argo submit ci.yaml -p git_sha=$SHA -p branch=$BRANCH --watch

Pipelines as workflows. Shareable templates. Multi-step builds, tests, deploys.
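A sketch of what a ci.yaml accepting those two parameters could look like. The repo URL, the my/ci image, and the make targets are hypothetical:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: ci-
spec:
  entrypoint: ci
  arguments:
    parameters:
      - name: git_sha            # supplied with -p at submit time
      - name: branch
  templates:
    - name: ci
      steps:
        - - name: build
            template: run
            arguments: { parameters: [{ name: cmd, value: "make build" }] }
        - - name: test
            template: run
            arguments: { parameters: [{ name: cmd, value: "make test" }] }

    - name: run
      inputs:
        parameters:
          - name: cmd
        artifacts:
          - name: source
            path: /src
            git:
              repo: https://github.com/example/app.git   # hypothetical repo
              revision: "{{workflow.parameters.git_sha}}"
      container:
        image: my/ci:1.0         # hypothetical image with make and the toolchain
        workingDir: /src
        command: [sh, -c]
        args: ["echo building {{workflow.parameters.branch}} && {{inputs.parameters.cmd}}"]
```

Each step runs in its own pod with a fresh checkout, so build outputs are not visible to test unless you pass them along as artifacts.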

For app deploys: Argo CD is the sibling (GitOps). See GitOps with Argo CD.

Common mistakes

1. Heavy steps without resource limits

OOM-killed; debugging confusing. Set resources.

2. No retries on flaky steps

Network blips kill the workflow. retryStrategy.
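One way to avoid forgetting this on individual steps is a workflow-wide default. A sketch using templateDefaults, with assumed values:

```yaml
spec:
  entrypoint: main
  # Applied to every template in this workflow unless the template overrides it.
  templateDefaults:
    retryStrategy:
      limit: 2
      retryPolicy: Always        # retry both failures (non-zero exit) and errors
      backoff:
        duration: 30s
        factor: 2
```

Steps that must not run twice (anything non-idempotent) should override this with their own retryStrategy.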

3. Massive YAML

DRY: use templates and parameters. Reusable libraries.
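In Argo the reusable library is usually a WorkflowTemplate that other workflows reference. A minimal sketch; the names etl-lib and run-etl are hypothetical:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: etl-lib                  # hypothetical shared library
spec:
  templates:
    - name: run-etl
      inputs:
        parameters:
          - name: dataset
      container:
        image: my/etl:1.0
        command: [python, run.py]
        args: ["--dataset", "{{inputs.parameters.dataset}}"]
---
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: orders-etl-
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        - - name: etl
            templateRef:
              name: etl-lib      # WorkflowTemplate name
              template: run-etl  # template inside it
            arguments:
              parameters:
                - name: dataset
                  value: orders
```

The WorkflowTemplate is applied to the cluster once; each pipeline then shrinks to a few lines of wiring.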

4. Forgot to clean up history

Completed Workflow objects accumulate in the cluster. Use ttlStrategy or periodic cleanup.
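A sketch of the cleanup settings on a workflow spec; the retention periods are assumptions, tune them to how long you need to debug failures:

```yaml
spec:
  entrypoint: main
  ttlStrategy:
    secondsAfterSuccess: 86400     # delete successful workflows after 1 day
    secondsAfterFailure: 604800    # keep failed ones for 7 days
  podGC:
    strategy: OnPodSuccess         # remove pods as soon as they succeed
```

With the workflow archive enabled, the run history stays queryable in the database after the object is deleted.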

5. Cron + every minute

CronWorkflow with * * * * *: pods every minute, all day. Most schedules can be coarser.

What I’d ship today

For K8s pipelines:

Argo Workflows for ML / data / CI pipelines.
CronWorkflows for scheduled jobs.
Argo Events for reactive triggers.
Karpenter + Argo for autoscaling GPU.
Standard templates in your org (DRY).
TTL strategy to clean up history.

Read this next

If you want my Argo Workflows templates (ML training, CI, ETL), they’re at rajpoot.dev.

Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work; see my projects and ways to reach me at rajpoot.dev.
