An OpenTSDB Retrospective

Over a year ago, I wrote about the success I’d had at $JOB with OpenTSDB. As with all success, this particular success meant we’d be punished…with more metrics! As our need for an effective TSDB grew, OpenTSDB started to creak. We’re now at the point of running PoCs of alternative metrics storage platforms to address our growing needs.

The major problems:

  1. The community feels stale and possibly dying.
    • There hasn’t been a release since 2018.
    • Pull requests and issues are piling up, with fewer and fewer responses from upstream.
    • Security issues aren’t being patched either, so potential RCEs linger in every production install.
  2. Query-result caching still doesn’t exist in OpenTSDB.
  3. OpenTSDB’s caching of HBase region locations is brittle.
    • We frequently see query (and write) failures when regions split: OpenTSDB’s region cache doesn’t register the split, so the API keeps reaching for the old (now closed) region and fails.
  4. HBase is far too complex a storage system for how brittle it is.
    • Complex deployments (HDFS+Namenode+Journalnodes+ZK+Regionservers+Masters)
    • Regions are slow to fail over (this is tunable, but tuning brings its own challenges, as any config change requires a service restart, which brings us right back to…)
    • Regions can get stuck during moves, which can cause serious problems in the cluster.
      • This is likely the most critical problem, because the first time we were struck with this “stuck region” problem, it took nearly 2 days to properly resolve it.
    • Cleanup (GCing closed/merged/etc. regions) takes an impressive amount of time, and can (practically) never catch up.
    • TTLs expire the data within a region but never remove the region itself, requiring custom code to clean up empty regions
    • As a cluster grows, the auto balancer becomes less and less useful.
      • With 42 region servers, we’ve seen an imbalance of over 100 regions between some servers without the balancer triggering.
  5. OpenTSDB has hard limits on the total number of metrics, tag keys, and tag values
    • These limits can only be set before initial ingest, and attempting to raise them later can break the entire table.
  6. OpenTSDB rollup management is difficult to do successfully
    • It took months to sort out a solid way to ingest data into rollup tables
    • It required writing custom software
  7. OpenTSDB suffers from cardinality problems far faster than many more modern TSDBs
    • Because each series touched by a query is another HBase row to scan, performance drops rapidly as cardinality increases.
      • e.g. One metric with 10k potential series means 10k rows scanned per hour of data requested in HBase.
  8. OpenTSDB’s query language is rigid
    • We can’t easily perform math between multiple series without gexp, which is difficult to reason about (and frequently unsupported in other tools)
  9. Other tools are no longer supporting OpenTSDB well
    • Grafana is clearly moving towards dropping support
    • Frequently the only support a tool offers is the ability to write data in OpenTSDB’s HTTP ingest format.
  10. Ecosystem tools are also dying
    • Bosun (a nice alerting tool on top of OpenTSDB) hasn’t had a release in years. Even shifting to a new maintainer hasn’t caused any serious improvement in dev velocity.
    • Splicer deserves a mention too: if it were still maintained, it would offer a great performance improvement.
  11. OpenTSDB doesn’t offer any form of tenant isolation.
    • All data lives in one table (or several tables if you use rollups), so reducing cardinality by writing group-specific data to separate tenants isn’t possible.
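To make the cardinality problem in item 7 concrete, here’s a minimal sketch of the scan-cost math. The function name and numbers are hypothetical; the model assumes, as OpenTSDB’s row-per-hour schema does, one HBase row per series per hour of queried data:

```python
# Rough cost model for an OpenTSDB query over HBase (hypothetical numbers).
# OpenTSDB stores one row per time series per hour, so scan cost grows
# multiplicatively with series cardinality and query time range.

def rows_scanned(series_count: int, hours: int) -> int:
    """Rows HBase must scan: one row per series per hour of data."""
    return series_count * hours

# One metric with 10k tag combinations, queried over a single day:
print(rows_scanned(10_000, 24))  # 240000 rows for one metric

# The same metric over a week-long dashboard window:
print(rows_scanned(10_000, 24 * 7))  # 1680000 rows
```

The multiplication is trivial, but it explains why a dashboard that widens its time range or adds one high-cardinality tag can push a query from fast to unusable.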

The reality is that a large portion of the industry is moving towards Prometheus (and related ecosystem) tools.

  1. PromQL answers all the problems with OpenTSDB’s query language handily.
  2. There are many options for distributed, long-term storage engines for Prometheus-style data.
    • While all the different tools offer trade-offs, the fact that there are many solutions in the space is comforting.
  3. Exposing data in Prometheus' exposition format is very intuitive for developers, reducing friction around instrumentation.
  4. Prometheus was designed from the start for containerized environments, so as our developers at $JOB move towards containers, we can more easily support them.
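As an illustration of point 3, here’s a minimal sketch of the Prometheus text exposition format, rendered without any client library. The metric name, labels, and helper function are all made up for the example; real services would normally use an official client library instead:

```python
# Sketch: rendering a counter in Prometheus' text exposition format.
# The format is just plain text lines (# HELP, # TYPE, then samples),
# which is why instrumenting a service is so low-friction.

def render_metric(name: str, help_text: str, samples: dict) -> str:
    """Render one counter: HELP/TYPE headers followed by labeled samples."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines)

body = render_metric(
    "http_requests_total",
    "Total HTTP requests served.",
    {(("method", "GET"), ("code", "200")): 1024,
     (("method", "POST"), ("code", "500")): 3},
)
print(body)
```

A developer can verify their instrumentation with nothing more than `curl` against the `/metrics` endpoint, which is a big part of the reduced friction.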

The long and short? OpenTSDB served us well while our architecture was relatively rigid and slow-growing. As we lean more heavily on service instrumentation, we need more flexible tools that respond better to our environment. So, so long, OpenTSDB: you were certainly an interesting tool, but we can’t keep using you.