Details
-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
rhel-8.6.0
Description
Description of problem:
For an unknown reason, `composer-cli compose status` fails, and the daemon logs a traceback with little detail beyond the message "job does not exist".
Version-Release number of selected component (if applicable):
osbuild-composer-46.3-1.el8_6
How reproducible:
Always, with the customer's osbuild-composer/ data.
Steps to Reproduce:
1. On a fresh RHEL 8.6 system, unpack the customer data tarball into /var/lib/osbuild-composer
2. Restart the composer daemon
3. Run `composer-cli compose status`
Actual results:
1/ Output
ERROR: List Error: Get "http://localhost/api/v1/compose/queue": EOF
2/ composer logs:
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: 2022/06/15 10:05:29 GET /api/v1/compose/queue
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: 2022/06/15 10:05:29 http: panic serving @: job does not exist
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: goroutine 30 [running]:
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: net/http.(*conn).serve.func1()
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /usr/lib/golang/src/net/http/server.go:1802 +0xb9
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: panic(
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /usr/lib/golang/src/runtime/panic.go:1047 +0x266
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: github.com/osbuild/osbuild-composer/internal/weldr.(*API).getComposeStatus(0xc000437d40, {0xc0005221b0, {0x0, {0x56375dae1ee0,>
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/internal/weldr/api.go:350 +>
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: github.com/osbuild/osbuild-composer/internal/weldr.(*API).composeQueueHandler(0xc000437d40, {0x56375dac59e8, 0xc00024c000}, 0x>
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/internal/weldr/api.go:2573 >
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: github.com/julienschmidt/httprouter.(*Router).ServeHTTP(0xc000096de0, {0x56375dac59e8, 0xc00024c000}, 0xc000218100)
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/vendor/github.com/juliensch>
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: github.com/osbuild/osbuild-composer/internal/weldr.(*API).ServeHTTP(0xc000437d40, {0x56375dac59e8, 0xc00024c000}, 0xc000218100)
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/internal/weldr/api.go:285 +>
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: net/http.serverHandler.ServeHTTP({0xc0002e4090}, {0x56375dac59e8, 0xc00024c000}, 0xc000218100)
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /usr/lib/golang/src/net/http/server.go:2879 +0x43b
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: net/http.(*conn).serve(0xc00022a000, {0x56375dacccd0, 0xc0001d5d10})
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /usr/lib/golang/src/net/http/server.go:1930 +0xb08
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: created by net/http.(*Server).Serve
Jun 15 10:05:29 localhost.localdomain osbuild-composer[84194]: /usr/lib/golang/src/net/http/server.go:3034 +0x4e8
Additional info:
- Tracing the daemon with strace suggests that the JSON file for one job ID is missing from /var/lib/osbuild-composer/jobs:
84197 10:05:29.778722 openat(AT_FDCWD, "/var/lib/osbuild-composer/jobs/ca461a0c-2e45-40b1-8b53-8a09bbf7f9a9.json", O_RDONLY|O_CLOEXEC <unfinished ...>
:
84197 10:05:29.779580 <... openat resumed>) = -1 ENOENT (No such file or directory) <0.000836>
:
84197 10:05:29.780068 write(2<UNIX:[8049596->8049597]>, "2022/06/15 10:05:29 http: panic serving @: job does not exist\ngoroutine 30 [running]:\nnet/http.(*conn).serve.func1()\n\t/usr/lib/golang/src/net/http/server.go:1802 +0xb9\npanic({0x56375d99b0e0, 0xc000329530}
)\n\t/usr/lib/golang/src/runtime/panic.go:1047 +0x266\ngithub.com/osbuild/osbuild-composer/internal/weldr.(*API).getComposeStatus(0xc000437d40, {0xc0005221b0, {0x0,
{0x56375dae1ee0, 0xc0003da870},
{0xc000480000, 0x3535b, 0x36000},
{0x0, 0x0, ...}, ...}, ...})\n\t/builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/internal/weldr/api.go:350 +0x2cd\ngithub.com/osbuild/osbuild-composer/internal/weldr.(*API).composeQueueHandler(0xc000437d40,
{0x56375dac59e8, 0xc00024c000}, 0xc00064a03a, {0xc0001b6280, 0x1, 0x4})\n\t/builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/internal/weldr/api.go:2573 +0x1c5\ngithub.com/julienschmidt/httprouter.(*Router).ServeHTTP(0xc000096de0, {0x56375dac59e8, 0xc00024c000}, 0xc000218100)\n\t/builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/vendor/github.com/julienschmidt/httprouter/router.go:387 +0x84b\ngithub.com/osbuild/osbuild-composer/internal/weldr.(*API).ServeHTTP(0xc000437d40,
{0x56375dac59e8, 0xc00024c000}, 0xc000218100)\n\t/builddir/build/BUILD/osbuild-composer-46.3/_build/src/github.com/osbuild/osbuild-composer/internal/weldr/api.go:285 +0x172\nnet/http.serverHandler.ServeHTTP({0xc0002e4090}, {0x56375dac59e8, 0xc00024c000}, 0xc000218100)\n\t/usr/lib/golang/src/net/http/server.go:2879 +0x43b\nnet/http.(*conn).serve(0xc00022a000,
{0x56375dacccd0, 0xc0001d5d10})\n\t/usr/lib/golang/src/net/http/server.go:1930 +0xb08\ncreated by net/http.(*Server).Serve\n\t/usr/lib/golang/src/net/http/server.go:3034 +0x4e8\n", 1757 <unfinished ...>
84196 10:05:29.780103 futex(0xc000088150, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
- Once this job was fixed manually, a second job turned out to cause the same failure.
- The two compose UUIDs involved are 0172feb8-ab86-45e0-b3dc-33f2a5252d75 and 2d9993e7-beb6-4da2-97ac-50fc9cc758bd.
- To fix the issue, I replaced the faulty job IDs in state.json with an existing one and then ran `composer-cli compose delete <COMPOSE-UUID>`. This process is very laborious and cannot be suggested as a workaround.
- The only workaround we can offer for now is to delete everything under /var/lib/osbuild-composer/ and restart the daemons, which is why this issue is raised with high severity.
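
To make the manual search for faulty job IDs less laborious, the check could be scripted. The following is only a rough sketch under stated assumptions, not a tested tool: it assumes state.json sits directly in /var/lib/osbuild-composer and that job references are stored under a key named `job_id` (both are guesses, the report does not confirm them). The function name `find_dangling_jobs` is hypothetical. It walks the JSON, collects every UUID stored under that key, and reports those with no matching `<uuid>.json` in the jobs directory.

```python
import json
import re
from pathlib import Path

# Matches a canonical lowercase or uppercase UUID string.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def iter_values_for_key(node, key):
    """Yield every value stored under `key` anywhere in nested JSON data."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from iter_values_for_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from iter_values_for_key(item, key)


def find_dangling_jobs(state_path, jobs_dir, key="job_id"):
    """Return sorted job IDs referenced in the state file but absent from jobs_dir.

    The key name "job_id" is an assumption about the state file layout.
    """
    with open(state_path, encoding="utf-8") as fh:
        state = json.load(fh)
    jobs_dir = Path(jobs_dir)
    missing = set()
    for value in iter_values_for_key(state, key):
        if isinstance(value, str) and UUID_RE.match(value):
            if not (jobs_dir / f"{value}.json").is_file():
                missing.add(value)
    return sorted(missing)


if __name__ == "__main__":
    # Assumed locations; adjust if the state file lives elsewhere.
    root = Path("/var/lib/osbuild-composer")
    state_file = root / "state.json"
    if state_file.is_file():
        for job_id in find_dangling_jobs(state_file, root / "jobs"):
            print(job_id)
    else:
        print(f"{state_file} not found")
```

The script only reads files, so it should be safe to run against a copy of the customer data before deciding which composes to repair or delete.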