RHEL-4574

osbuild-composer: corrupted state data ("job does not exist")

Details

    • Major
    • sst_image_builder
    • ssg_front_door
    • False

    Description

      Description of problem:
      For an unknown reason, `composer-cli compose status` fails and the daemon logs a panic traceback with few details beyond "job does not exist".

      Version-Release number of selected component (if applicable):
      osbuild-composer-46.3-1.el8_6

      How reproducible:
      Always, using the customer's osbuild-composer/{state.json,jobs} data.

      Steps to Reproduce:
      1. On a fresh RHEL 8.6, unpack the tarball into /var/lib/osbuild-composer
      2. Restart the composer daemon
      3. Run `composer-cli compose status`

      Actual results:
      1/ Output
      ERROR: List Error: Get "http://localhost/api/v1/compose/queue": EOF

      2/ composer logs:
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: 2022/06/15 10:05:29 GET /api/v1/compose/queue
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: 2022/06/15 10:05:29 http: panic serving @: job does not exist
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: goroutine 30 [running]:
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: net/http.(*conn).serve.func1()
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /usr/lib/golang/src/net/http/server.go:1802 +0xb9
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: panic({0x56375d99b0e0, 0xc000329530})
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /usr/lib/golang/src/runtime/panic.go:1047 +0x266
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: github.com/osbuild/osbuild-composer/internal/weldr.(*API).getComposeStatus(0xc000437d40, {0xc0005221b0, {0x0, {0x56375dae1ee0,>
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/internal/weldr/api.go:350 +>
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: github.com/osbuild/osbuild-composer/internal/weldr.(*API).composeQueueHandler(0xc000437d40, {0x56375dac59e8, 0xc00024c000}, 0x>
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/internal/weldr/api.go:2573 >
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: github.com/julienschmidt/httprouter.(*Router).ServeHTTP(0xc000096de0, {0x56375dac59e8, 0xc00024c000}, 0xc000218100)
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/vendor/github.com/juliensch>
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: github.com/osbuild/osbuild-composer/internal/weldr.(*API).ServeHTTP(0xc000437d40, {0x56375dac59e8, 0xc00024c000}, 0xc000218100)
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/internal/weldr/api.go:285 +>
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: net/http.serverHandler.ServeHTTP({0xc0002e4090}, {0x56375dac59e8, 0xc00024c000}, 0xc000218100)
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /usr/lib/golang/src/net/http/server.go:2879 +0x43b
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: net/http.(*conn).serve(0xc00022a000, {0x56375dacccd0, 0xc0001d5d10})
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /usr/lib/golang/src/net/http/server.go:1930 +0xb08
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: created by net/http.(*Server).Serve
      Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /usr/lib/golang/src/net/http/server.go:3034 +0x4e8


      Additional info:
      • Tracing the daemon with strace shows that the JSON file for one job ID is missing from /var/lib/osbuild-composer/jobs:

      84197 10:05:29.778722 openat(AT_FDCWD, "/var/lib/osbuild-composer/jobs/ca461a0c-2e45-40b1-8b53-8a09bbf7f9a9.json", O_RDONLY|O_CLOEXEC <unfinished ...>
      :
      84197 10:05:29.779580 <... openat resumed>) = -1 ENOENT (No such file or directory) <0.000836>
      :
      84197 10:05:29.780068 write(2<UNIX:[8049596->8049597]>, "2022/06/15 10:05:29 http: panic serving @: job does not exist\ngoroutine 30 [running]:\nnet/http.(*conn).serve.func1()\n\t/usr/lib/golang/src/net/http/server.go:1802 +0xb9\npanic({0x56375d99b0e0, 0xc000329530})\n\t/usr/lib/golang/src/runtime/panic.go:1047 +0x266\ngithub.com/osbuild/osbuild-composer/internal/weldr.(*API).getComposeStatus(0xc000437d40, {0xc0005221b0, {0x0, {0x56375dae1ee0, 0xc0003da870}, {0xc000480000, 0x3535b, 0x36000}, {0x0, 0x0, ...}, ...}, ...})\n\t/builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/internal/weldr/api.go:350 +0x2cd\ngithub.com/osbuild/osbuild-composer/internal/weldr.(*API).composeQueueHandler(0xc000437d40, {0x56375dac59e8, 0xc00024c000}, 0xc00064a03a, {0xc0001b6280, 0x1, 0x4})\n\t/builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/internal/weldr/api.go:2573 +0x1c5\ngithub.com/julienschmidt/httprouter.(*Router).ServeHTTP(0xc000096de0, {0x56375dac59e8, 0xc00024c000}, 0xc000218100)\n\t/builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/vendor/github.com/julienschmidt/httprouter/router.go:387 +0x84b\ngithub.com/osbuild/osbuild-composer/internal/weldr.(*API).ServeHTTP(0xc000437d40, {0x56375dac59e8, 0xc00024c000}, 0xc000218100)\n\t/builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/internal/weldr/api.go:285 +0x172\nnet/http.serverHandler.ServeHTTP({0xc0002e4090}, {0x56375dac59e8, 0xc00024c000}, 0xc000218100)\n\t/usr/lib/golang/src/net/http/server.go:2879 +0x43b\nnet/http.(*conn).serve(0xc00022a000, {0x56375dacccd0, 0xc0001d5d10})\n\t/usr/lib/golang/src/net/http/server.go:1930 +0xb08\ncreated by net/http.(*Server).Serve\n\t/usr/lib/golang/src/net/http/server.go:3034 +0x4e8\n", 1757 <unfinished ...>
      84196 10:05:29.780103 futex(0xc000088150, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>

      • Once this job was fixed manually, another job turned out to cause the same failure.
      • The two compose UUIDs involved in this issue are 0172feb8-ab86-45e0-b3dc-33f2a5252d75 and 2d9993e7-beb6-4da2-97ac-50fc9cc758bd.
      • To fix the issue, I replaced each faulty job ID in state.json with an existing one and then ran `composer-cli compose delete <COMPOSE-UUID>`. This process is laborious and cannot be recommended as a workaround.
      • The only workaround we can offer for now is to delete everything under /var/lib/osbuild-composer/ and restart the daemons, hence raising this issue with high severity.
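
      The strace finding above (a job ID referenced by the daemon with no matching file under jobs/) could be checked for every job at once rather than one panic at a time. The following is only a minimal sketch: the jobs/<job-id>.json layout is taken from the strace output, but the key names under which state.json stores job IDs (`jobid`, `job_id`) are assumptions and should be verified against the actual file before relying on the result. The script only reads; it does not modify any state.

      ```python
      import json
      import re
      from pathlib import Path

      # Matches strings shaped like the job IDs seen in the strace output.
      UUID_RE = re.compile(
          r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I
      )


      def collect_job_ids(node, key_names):
          """Recursively yield UUID-shaped strings stored under any key in key_names."""
          if isinstance(node, dict):
              for key, value in node.items():
                  if key in key_names and isinstance(value, str) and UUID_RE.match(value):
                      yield value
                  else:
                      yield from collect_job_ids(value, key_names)
          elif isinstance(node, list):
              for item in node:
                  yield from collect_job_ids(item, key_names)


      def find_missing_jobs(state_path, jobs_dir, key_names=("jobid", "job_id")):
          """Return sorted job IDs found in state.json that have no jobs/<id>.json file.

          key_names is an ASSUMPTION about the state.json schema, not a documented
          osbuild-composer interface; adjust it to the keys actually present.
          """
          state = json.loads(Path(state_path).read_text())
          jobs_dir = Path(jobs_dir)
          return sorted(
              {
                  job_id
                  for job_id in collect_job_ids(state, key_names)
                  if not (jobs_dir / f"{job_id}.json").is_file()
              }
          )


      if __name__ == "__main__":
          base = Path("/var/lib/osbuild-composer")
          state_file = base / "state.json"
          if state_file.is_file():
              for job_id in find_missing_jobs(state_file, base / "jobs"):
                  print(f"missing job file: {job_id}.json")
          else:
              print(f"{state_file} not found")
      ```

      Run as root on the affected host, this would list each job ID that state.json references without a corresponding file, which are the candidates for the "job does not exist" panic.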

          People

            osbuilders Osbuilders Bot Account
            rhn-support-cbesson Christophe Besson
            Osbuilders Bot Account Osbuilders Bot Account
            RH Bugzilla Integration RH Bugzilla Integration
            Eliane Pereira Eliane Pereira