OpenShift Pipelines / SRVKP-4499

Pending pipeline runs panic the watcher when the repository has no concurrency limit


      A small timing window exists in which a Go panic can occur in the Pipelines as Code watcher. It happens when a PipelineRun created by Pipelines as Code is being modified concurrently while the watcher manipulates its Pending field, and no concurrency limit is set in the corresponding Repository object.

      Also, a concurrency limit of 0 was not being honored as unlimited concurrency.

      A workaround is to set a very high concurrency limit in the Repository object.
    • Bug Fix
    • Pipelines Sprint Pioneers 3
    • Critical

      The watcher in Konflux prod panicked continually on startup because pending pipeline runs existed.

       

      Stack trace:

      panic: runtime error: invalid memory address or nil pointer dereference
      [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1b1905b]
      goroutine 41 [running]:
      github.com/openshift-pipelines/pipelines-as-code/pkg/sync.(*QueueManager).getSemaphore(0xc0005c53c8, 0xc000a3e0a8)
      /go/src/github.com/openshift-pipelines/pipelines-as-code/pkg/sync/queue_manager.go:55 +0x15b
      github.com/openshift-pipelines/pipelines-as-code/pkg/sync.(*QueueManager).AddListToQueue(0xc0005c53c8, 0xc000a3e0a8, {0xc000fb42e0, 0x1, 0x0?})
      /go/src/github.com/openshift-pipelines/pipelines-as-code/pkg/sync/queue_manager.go:83 +0xf6
      github.com/openshift-pipelines/pipelines-as-code/pkg/reconciler.(*Reconciler).queuePipelineRun(0xc00062b680, {0x28aed30, 0xc000626a20}, 0xc000fb42c0?, 0xc00039cb40)
      /go/src/github.com/openshift-pipelines/pipelines-as-code/pkg/reconciler/queue_pipelineruns.go:48 +0x245
      github.com/openshift-pipelines/pipelines-as-code/pkg/reconciler.(*Reconciler).ReconcileKind(0xc00062b680, {0x28aed30, 0xc0016d5800}, 0xc00039cb40)
      /go/src/github.com/openshift-pipelines/pipelines-as-code/pkg/reconciler/reconciler.go:93 +0x5bf
      github.com/tektoncd/pipeline/pkg/client/injection/reconciler/pipeline/v1/pipelinerun.(*reconcilerImpl).Reconcile(0xc0006999a0, {0x28aed30, 0xc0016d57d0}, {0xc0018ae7e0, 0x2d})
      /go/src/github.com/openshift-pipelines/pipelines-as-code/vendor/github.com/tektoncd/pipeline/pkg/client/injection/reconciler/pipeline/v1/pipelinerun/reconciler.go:236 +0x542
      knative.dev/pkg/controller.(*Impl).processNextWorkItem(0xc0000b14a0)
      /go/src/github.com/openshift-pipelines/pipelines-as-code/vendor/knative.dev/pkg/controller/controller.go:542 +0x4cd
      knative.dev/pkg/controller.(*Impl).RunContext.func3()
      /go/src/github.com/openshift-pipelines/pipelines-as-code/vendor/knative.dev/pkg/controller/controller.go:491 +0x68
      created by knative.dev/pkg/controller.(*Impl).RunContext
      /go/src/github.com/openshift-pipelines/pipelines-as-code/vendor/knative.dev/pkg/controller/controller.go:489 +0x354

       

      We need to check the concurrency limit in getSemaphore, not just in the spots that call AddListToQueue.
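
A rough sketch of what such a check inside getSemaphore could look like is below. It is not the actual Pipelines as Code code: `Repository`, `QueueManager`, and `effectiveLimit` are simplified, hypothetical stand-ins, and the real `getSemaphore` signature and return values may differ. It assumes the limit is an optional pointer field, and that both an unset limit and a limit of 0 should mean "unlimited", as the release note describes.

```go
package main

import "errors"

// Repository is a hypothetical, trimmed-down stand-in for the Pipelines as
// Code Repository object; only the optional concurrency limit is modeled.
type Repository struct {
	Name             string
	ConcurrencyLimit *int // nil means the limit was never set
}

// errNoLimit signals that the repository does not bound concurrency, so
// there is no semaphore to hand out.
var errNoLimit = errors.New("repository has no concurrency limit")

// effectiveLimit reports the configured limit and whether it actually bounds
// concurrency. An unset (nil) or zero limit is treated as unlimited.
// Treating negative values the same way is an assumption of this sketch.
func effectiveLimit(repo *Repository) (limit int, limited bool) {
	if repo == nil || repo.ConcurrencyLimit == nil || *repo.ConcurrencyLimit <= 0 {
		return 0, false
	}
	return *repo.ConcurrencyLimit, true
}

// QueueManager is a hypothetical simplification: it only records the
// capacity of each repository's semaphore.
type QueueManager struct {
	capacities map[string]int // repository name -> semaphore capacity
}

// getSemaphore returns the semaphore capacity for repo. The guard lives
// here, so callers such as AddListToQueue cannot trigger a nil pointer
// dereference even if they forget to check the limit themselves.
func (qm *QueueManager) getSemaphore(repo *Repository) (int, error) {
	limit, limited := effectiveLimit(repo)
	if !limited {
		return 0, errNoLimit
	}
	if qm.capacities == nil {
		qm.capacities = make(map[string]int)
	}
	qm.capacities[repo.Name] = limit
	return limit, nil
}
```

With this shape, a caller that receives `errNoLimit` can skip queueing and let the PipelineRun start right away instead of crashing the watcher. Whether the real fix returns an error or handles the unlimited case some other way is up to the PR.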

       

      sashture and I confirmed that the Repository objects in the namespace in question did NOT have a concurrency limit set.

       

      I'll code up the PR, but I am assigning this to sashture to deal with the hard part: working with enatan as needed to either get a valid nightly passing into Konflux infra-deployments, or get a valid PAC watcher image that we can patch into Konflux prod a la https://github.com/openshift-pipelines/pipeline-service/compare/main...gabemontero:pipeline-service:override-tekton-ctrl-img

       

      Our workaround of deleting the pending pipeline runs was OK for the test namespace where we encountered this. If it happens in an actual user's namespace, we may not have that option, or at least we would have to see exactly which pipeline runs were pending and work with the user and Konflux to confirm that deleting them is OK.

       

      Hence I'm marking this as Blocker in case the workaround is not viable next time.

       

      cboudjna@redhat.com 

            gmontero@redhat.com Gabe Montero
            rhtap-jira-bot RHTAP Jira Bot
            Savita .