Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.8.1, 0.14.0
    • Fix Version/s: Future
    • Component/s: Core, REST
    • Labels:
      None

      Description

      We are missing APIs for changing data retention. There is a system-wide default data retention of 7 days, which can be overridden at startup with the hawkular.metrics.default-ttl system property or with the DEFAULT_TTL environment variable. Data retention can also be specified when creating a tenant. If set for the tenant, it overrides the system-wide default and applies to all metrics owned by the tenant. Data retention can also be specified when creating a metric. If set, it overrides the tenant-level retention as well as the system-wide default.
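
      To make the precedence concrete, here is a minimal sketch of how the effective retention could be resolved. The class and method names are hypothetical and not part of Hawkular Metrics, and I am assuming retention is expressed in whole days with null meaning "not set".

```java
// Hypothetical helper for illustration only; not Hawkular Metrics API.
final class RetentionResolver {

    private RetentionResolver() {
    }

    /**
     * Resolves the retention that applies to a metric.
     * Precedence: metric-level setting, then tenant-level setting,
     * then the system-wide default.
     *
     * @param metricDays        metric-level retention, or null if not set
     * @param tenantDays        tenant-level retention, or null if not set
     * @param systemDefaultDays system-wide default (7 unless overridden at startup)
     */
    static int effectiveRetentionDays(Integer metricDays, Integer tenantDays, int systemDefaultDays) {
        if (metricDays != null) {
            return metricDays;
        }
        if (tenantDays != null) {
            return tenantDays;
        }
        return systemDefaultDays;
    }
}
```

      For example, effectiveRetentionDays(null, 14, 7) yields the tenant value because no metric-level value is set.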

      We need similar behavior for changing data retention. Specifically, there should be endpoints for updating it at the tenant level as well as at the metric level. We should also allow tag filters to be used to select multiple metrics in a single update.
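
      As a rough illustration of the tag filter idea, the sketch below selects every metric whose tags contain all of the filter's key/value pairs. Exact-match semantics, the class name, and the in-memory map of tags are assumptions on my part; the real filter syntax and storage are not decided here.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch; exact-match tag semantics are an assumption.
final class TagFilterSketch {

    private TagFilterSketch() {
    }

    /**
     * Returns the ids of all metrics whose tags include every
     * key/value pair in the filter, sorted for stable output.
     *
     * @param tagsByMetric metric id mapped to that metric's tags
     * @param filter       tag name mapped to the required tag value
     */
    static List<String> matchingMetrics(Map<String, Map<String, String>> tagsByMetric,
                                        Map<String, String> filter) {
        return tagsByMetric.entrySet().stream()
                // A metric matches when its tag entries are a superset of the filter entries.
                .filter(e -> e.getValue().entrySet().containsAll(filter.entrySet()))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }
}
```

      The retention update would then be applied to each returned metric id.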

      Data retention settings are stored in the retentions_idx table. At startup, we read that table and cache the settings in memory. We do this to avoid having to query the database every time we insert a data point. We expire data using Cassandra's TTL (though that will change once we finish HWKMETRICS-191). We set the TTL on each write, hence the need for caching the retention settings.
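
      A minimal sketch of the write-path lookup follows. The class is hypothetical and only stands in for the in-memory copy of retentions_idx; it assumes retention is stored in days and converted to the seconds that a CQL USING TTL clause expects.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical in-memory cache; names and units (days) are assumptions.
final class RetentionCache {

    private final Map<String, Integer> tenantDays = new ConcurrentHashMap<>();
    private final Map<List<String>, Integer> metricDays = new ConcurrentHashMap<>();
    private final int defaultDays;

    RetentionCache(int defaultDays) {
        this.defaultDays = defaultDays;
    }

    void putTenant(String tenantId, int days) {
        tenantDays.put(tenantId, days);
    }

    void putMetric(String tenantId, String metricId, int days) {
        metricDays.put(key(tenantId, metricId), days);
    }

    /**
     * Returns the TTL in seconds to attach to a write, without touching
     * the database. Falls back from metric to tenant to the default.
     */
    int ttlSeconds(String tenantId, String metricId) {
        Integer days = metricDays.get(key(tenantId, metricId));
        if (days == null) {
            days = tenantDays.get(tenantId);
        }
        if (days == null) {
            days = defaultDays;
        }
        return Math.toIntExact(TimeUnit.DAYS.toSeconds(days));
    }

    // A two-element list has value-based equals/hashCode, so it works as a composite key.
    private static List<String> key(String tenantId, String metricId) {
        return Arrays.asList(tenantId, metricId);
    }
}
```

      Each insert would call ttlSeconds(...) and bind the result as the TTL of the write, which is why a stale cache directly translates into data being written with the old retention.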

      When retention settings are changed, each Hawkular Metrics instance needs to be made aware of the changes. We could introduce a mechanism for inter-node communication, but I think the simplest solution is to poll the database. Each Hawkular Metrics instance can run a periodic job in a background thread that reloads the settings from retentions_idx.
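
      The polling job could look roughly like the sketch below. Everything here is hypothetical: the loader Supplier stands in for the retentions_idx scan, the map key is a plain string id, and the polling period is left to the caller.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical poller; the loader stands in for a scan of retentions_idx.
final class RetentionPoller implements AutoCloseable {

    private final Supplier<Map<String, Integer>> loader;
    private final ScheduledExecutorService scheduler;

    // Readers always see a complete, immutable snapshot.
    private volatile Map<String, Integer> snapshot = Collections.emptyMap();

    RetentionPoller(Supplier<Map<String, Integer>> loader) {
        this.loader = loader;
        this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "retention-poller");
            t.setDaemon(true); // do not keep the JVM alive for polling
            return t;
        });
    }

    /** Starts reloading immediately and then after each fixed delay. */
    void start(long period, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(this::reload, 0, period, unit);
    }

    /** Reloads all settings and swaps the snapshot in one assignment. */
    void reload() {
        try {
            snapshot = Collections.unmodifiableMap(new HashMap<>(loader.get()));
        } catch (RuntimeException e) {
            // Keep the previous snapshot. An uncaught exception would
            // cancel all future runs of the scheduled task.
        }
    }

    Map<String, Integer> current() {
        return snapshot;
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

      Swapping a whole snapshot keeps the write path lock-free, at the cost of changes becoming visible only after the next poll.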

      We will essentially have to perform a full table scan of retentions_idx. I do not think that will be a big issue performance-wise because I do not think users are introducing a lot of custom retention settings. While this might not be the most efficient solution, I prefer it because we can easily backport it to the 0.8.0 branch, which is something I think we need to consider doing.

      There are more efficient polling solutions we could and probably should explore, but I do not think we need them for an initial implementation. The last thing we will need to make very clear in the docs is that changing data retention will only affect new data. With Cassandra's TTL, we would have to read out all existing data and write it back with a new TTL. We certainly do not want to do that. Once HWKMETRICS-191 is done, changing the retention will apply to both new and existing data and will essentially be a constant-time operation.
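
      To spell out why only new data is affected: with TTL-based expiry, the expiration of a data point is fixed by the retention in effect when it was written. The helper below is a hypothetical illustration of that arithmetic, assuming retention in whole days.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration of TTL-at-write semantics.
final class TtlExpirySketch {

    private TtlExpirySketch() {
    }

    /**
     * A data point expires at its write time plus the retention that was
     * in effect at that moment. Later retention changes do not move it.
     */
    static Instant expiresAt(Instant writtenAt, int retentionDaysAtWrite) {
        return writtenAt.plus(Duration.ofDays(retentionDaysAtWrite));
    }
}
```

      So a point written under a 7-day retention still expires 7 days after its write, even if the retention is raised to 30 days the following day. Only points written after the change get the 30-day TTL.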

                People

                • Assignee:
                  Unassigned
                  Reporter:
                  john.sanda John Sanda
                • Votes:
                  0
                  Watchers:
                  3
