Ansible Automation Platform RFEs / AAPRFE-1168

etcd Content Collection for Ansible


    • Feature Request
    • Resolution: Unresolved
    • ANSTRAT-447 - Business Continuity of cloud-native infrastructure

      etcd is best known as Kubernetes' primary datastore, which holds the cluster's configuration data, state, and metadata. People working in the field have identified a need to take snapshots of etcd’s data for disaster recovery and to perform quorum fixes. Consultants have driven the command-line tool etcdctl from the Ansible shell module, but this is laborious, forces “coding” in playbooks, and breaks easily. Currently, neither OpenShift nor the etcd project provides any facility for these operations.

      While automating backups or quorum fixes is achievable today with the etcdctl command-line tool and the shell and command modules, the result is fragile and requires significant care and understanding of how to perform these operations properly. This content is intended to make automating etcd management operations more straightforward and robust with less effort.
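
      To illustrate the kind of wrapping that playbook authors currently have to build by hand, here is a minimal Python sketch of a helper around `etcdctl snapshot save`. The function names, the default endpoint, and the error handling are assumptions made for illustration; they are not part of any existing collection.

```python
import os
import subprocess
from typing import List, Optional


def build_snapshot_command(
    snapshot_path: str,
    endpoint: str = "https://127.0.0.1:2379",  # assumed default, adjust per cluster
    cacert: Optional[str] = None,
    cert: Optional[str] = None,
    key: Optional[str] = None,
) -> List[str]:
    """Build the argv list for `etcdctl snapshot save`.

    Passing a list (not a shell string) avoids the quoting problems that
    make shell-module tasks fragile.
    """
    argv = ["etcdctl", f"--endpoints={endpoint}"]
    if cacert:
        argv.append(f"--cacert={cacert}")
    if cert:
        argv.append(f"--cert={cert}")
    if key:
        argv.append(f"--key={key}")
    argv += ["snapshot", "save", snapshot_path]
    return argv


def save_snapshot(snapshot_path: str, **connection: str) -> None:
    """Run the snapshot and surface etcdctl's stderr if it fails."""
    argv = build_snapshot_command(snapshot_path, **connection)
    # ETCDCTL_API=3 selects the v3 API on older etcdctl builds.
    env = dict(os.environ, ETCDCTL_API="3")
    result = subprocess.run(argv, env=env, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"etcdctl snapshot save failed: {result.stderr.strip()}"
        )
```

      Even this small helper has to deal with TLS flags, environment setup, and error reporting, which is the effort the proposed collection is meant to absorb.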

      Proposed Solution

      Create a new collection (etcd.core) that supports these operations without resorting to the shell/command modules, hand-built command lines, and “programming” to extract and store data (facts). The key use cases should be frictionless to implement, so that end users do not have to manage or even think about how these tasks are carried out.

      What this should not cover is writing key/value pairs to etcd directly, at least not initially. (There is a use case outside of Kubernetes itself, but it seems to be uncommon at this point.) The reasoning is that a user could crash a K8s cluster by doing so, so this form of operation should be avoided, and we don’t want to encourage it with easy automation. There may be a case for read-only access, but even that should go through the K8s API. This could eventually change if we find enough usage of etcd outside of Kubernetes.

      Requirements

      • etcd 3.0 API (support of OCP 4+)
      • Functional parity with etcdctl
      • Support of leases and versions

      User Experience

      This solution should conform to standard recommended Ansible practices. It should reduce the knowledge and time needed to automate these use cases by abstracting implementation details and error handling, and by offering a concise, declarative interface that avoids programming constructs at the play level. It should provide user conveniences such as reasonable parameter defaults and support for module defaults. The solution should also integrate with Ansible Automation Platform controller services such as integrated credential management.
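
      As a rough illustration of "reasonable parameter defaults", the sketch below shows what an argument spec for a snapshot module could look like, written in the dictionary style that Ansible modules use. The module name `etcd.core.snapshot`, every parameter name, and every default value are hypothetical. The `resolve_params` helper is only a simplified stand-in for the default filling that Ansible performs itself, included so the example can run on its own.

```python
from typing import Any, Dict

# Hypothetical argument spec for an illustrative `etcd.core.snapshot` module.
# None of these names or defaults come from an existing collection.
SNAPSHOT_ARGUMENT_SPEC: Dict[str, Dict[str, Any]] = {
    "state": {"type": "str", "choices": ["saved", "restored"], "default": "saved"},
    "path": {"type": "path", "required": True},
    "endpoints": {
        "type": "list",
        "elements": "str",
        "default": ["https://127.0.0.1:2379"],
    },
    "ca_cert": {"type": "path"},
    "client_cert": {"type": "path"},
    "client_key": {"type": "path"},
    "timeout": {"type": "int", "default": 30},
}


def resolve_params(
    user_params: Dict[str, Any],
    spec: Dict[str, Dict[str, Any]] = SNAPSHOT_ARGUMENT_SPEC,
) -> Dict[str, Any]:
    """Simplified stand-in for Ansible's own parameter handling.

    Fills in defaults, rejects unknown parameters, and enforces
    required ones, so a task only states what differs from the defaults.
    """
    unknown = set(user_params) - set(spec)
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    resolved: Dict[str, Any] = {}
    for name, options in spec.items():
        if name in user_params:
            resolved[name] = user_params[name]
        elif options.get("required"):
            raise ValueError(f"missing required parameter: {name}")
        else:
            resolved[name] = options.get("default")
    return resolved
```

      With a spec along these lines, a task would only need to supply `path`; connection details could come from module defaults or from a controller credential.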

      Documentation

      All documentation for this effort is believed to be net new and should not require updates to existing documentation. A separate scenario guide and recommended practices should be explored for AAP subscribers. Documentation for the Ansible for Red Hat OpenShift solution blueprints may be impacted.

      Use Cases

      The following are some initial use cases this collection should target:

      • Backup/Restore etcd (snapshots)
      • etcd health checks
      • Configure etcd encryption
      • Querying etcd for key/value pairs (including previous versions) for debugging and diagnostic use. Note that this is read-only.
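
      The health-check and read-only query use cases above currently mean parsing etcdctl's JSON output by hand. The Python sketch below shows the parsing a module could hide from users. It assumes the JSON shapes produced by `etcdctl endpoint health -w json` and `etcdctl get <key> -w json` (where keys and values are base64-encoded); the function names are hypothetical.

```python
import base64
import json
from typing import Any, Dict, List


def unhealthy_endpoints(health_json: str) -> List[str]:
    """Return the endpoints that did not report healthy.

    Assumes a JSON list of objects with "endpoint" and "health" fields.
    """
    return [
        entry["endpoint"]
        for entry in json.loads(health_json)
        if not entry.get("health", False)
    ]


def decode_kvs(get_json: str) -> List[Dict[str, Any]]:
    """Decode key/value pairs from `etcdctl get ... -w json` output.

    Read-only: this only interprets a response, it never writes to etcd.
    """
    response = json.loads(get_json)
    decoded = []
    for kv in response.get("kvs", []):
        decoded.append({
            "key": base64.b64decode(kv["key"]).decode("utf-8"),
            # Kubernetes values are often binary, so replace undecodable bytes.
            "value": base64.b64decode(kv.get("value", "")).decode(
                "utf-8", errors="replace"
            ),
            "mod_revision": kv.get("mod_revision"),
            "version": kv.get("version"),
        })
    return decoded
```

      A previous version of a key can be requested with the `--rev` flag of `etcdctl get`, and the response can be decoded the same way.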

      Out of Scope

      • Creating, updating, and deleting key/value pairs. This is an anti-pattern in the context of Kubernetes and cloud-native systems that we don't want to encourage with Ansible content.

            mferrari@redhat.com Massimo Ferrari
            rht-tima Timothy Appnel