TL;DR

The author, who created Timber.io (later Vector), says observability vendors have neglected the hardest question: how much of customers' telemetry is waste. After leaving Vector he analyzed production data, built an automated system to classify logs, and found roughly 40% of log data could be considered waste in his tests.

What happened

The author left an engineering role in 2016 to start Timber.io, which evolved into Vector, saw broad adoption, and was later acquired; he remained with the product for three years. Over time he grew frustrated that customers were burdened by soaring observability bills and by vendors reluctant to help reduce costs.

After departing Vector he was asked by former users to investigate. With permission to access a customer's Vector deployment and its complex configuration (sampling, aggregation, tiering, and extensive regex-based filters), he explored whether much of the data could be safely removed. Using Hyperscan to scale pattern matching, he built an automated pipeline that compressed billions of log lines into thousands of semantic events and evaluated them with service context. Repeated checks across multiple services returned waste estimates of roughly 30–60%, averaging near 40%. Rolling the analysis out incrementally helped teams clean up logging, simplify pipelines, and lower costs. The author says he and colleagues are now building a solution to this problem at Tero.
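
The post doesn't publish the pipeline's internals, but the compression step it describes resembles classic log templating: normalize the variable parts of each line (ids, timestamps, numbers) so that billions of raw lines collapse into a small set of recurring event shapes. Below is a minimal sketch of that idea in plain Python; the normalization rules and sample lines are hypothetical, and the author's actual system used Hyperscan with far larger pattern sets.

    import re
    from collections import Counter

    # Hypothetical normalization rules: each replaces a variable field
    # with a placeholder so lines sharing one shape map to one template.
    NORMALIZERS = [
        (re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-"
                    r"[0-9a-f]{4}-[0-9a-f]{12}\b"), "<uuid>"),
        (re.compile(r"\b\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\S*"), "<ts>"),
        (re.compile(r"\b\d+\b"), "<num>"),
    ]

    def to_template(line: str) -> str:
        """Collapse a raw log line into its semantic event template."""
        for pattern, placeholder in NORMALIZERS:
            line = pattern.sub(placeholder, line)
        return line

    def compress(lines):
        """Count how often each template occurs across a stream of lines."""
        return Counter(to_template(line) for line in lines)

    # Example: three raw lines collapse into two semantic events.
    sample = [
        "2024-05-01T12:00:01Z user 1042 logged in",
        "2024-05-01T12:00:02Z user 2971 logged in",
        "2024-05-01T12:00:03Z payment 77 failed",
    ]
    for template, count in compress(sample).most_common():
        print(count, template)

Once lines are grouped this way, a reviewer (or a model supplied with service context) only has to judge thousands of templates instead of billions of lines, which is what makes a per-event waste verdict tractable.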

Why it matters

  • Cost is the dominant pain point in observability; excessive telemetry drives bills and operational overhead.
  • Identifying and removing waste can substantially reduce vendor spend and simplify observability pipelines.
  • When vendors don't help customers trim unnecessary data, engineering teams become de facto cost controllers instead of focusing on product work.
  • Reducing noise can improve the signal-to-noise ratio for debugging, alerting, and machine-assisted root-cause analysis.

Key facts

  • The author founded Timber.io in 2016; the product later became Vector and was widely adopted before being acquired.
  • He remained with Vector for three years after acquisition and later left to pursue other work.
  • Customers frequently face large, growing observability bills and friction with vendors over cost management.
  • In one customer's Vector environment the author analyzed complex configurations that included sampling, aggregation, storage tiering and many regex filters.
  • He used Hyperscan to compile very large pattern sets and apply them at line rate, making matching feasible at scale (see the sketch after this list).
  • His automated process converted billions of log lines into thousands of semantic events and evaluated them with contextual metadata.
  • Across multiple services his analysis found waste rates of roughly 30% to 60%, averaging around 40% in the examined cases.
  • After sharing results and rolling the approach out incrementally, teams reduced their logging noise, simplified pipelines and lowered bills.
  • The author says he is building a product addressing this question at Tero.
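
The source gives no code, but the Hyperscan step amounts to compiling many filter patterns into one automaton and scanning each line exactly once. The sketch below illustrates the same multi-pattern idea using Python's re module as a small-scale stand-in (Hyperscan itself is a C library); the drop-filters and sample lines are hypothetical.

    import re

    # Hypothetical drop-filters, in the spirit of the regex rules in
    # the customer's Vector config. Hyperscan compiles thousands of
    # these into one database; re's alternation stands in here.
    DROP_PATTERNS = {
        "health_check": r"GET /healthz",
        "debug_noise": r"\bDEBUG\b",
        "heartbeat": r"heartbeat tick",
    }

    COMBINED = re.compile(
        "|".join(f"(?P<{name}>{pat})" for name, pat in DROP_PATTERNS.items())
    )

    def waste_estimate(lines):
        """Return (fraction of lines hit by any drop-filter, per-filter hits)."""
        hits = {name: 0 for name in DROP_PATTERNS}
        matched = total = 0
        for line in lines:
            total += 1
            m = COMBINED.search(line)
            if m:
                matched += 1
                hits[m.lastgroup] += 1
        return (matched / total if total else 0.0), hits

    lines = [
        "GET /healthz 200",
        "DEBUG cache warmed",
        "ERROR payment failed",
    ]
    ratio, hits = waste_estimate(lines)
    print(f"waste: {ratio:.0%}", hits)  # waste: 67% {...}

The design point is the single pass: checking each line against thousands of patterns one by one costs O(patterns x lines), while a compiled automaton like Hyperscan's keeps the cost near one scan per line regardless of pattern count.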

What to watch next

  • Whether other observability vendors respond by offering transparent waste analysis or changing incentives: not confirmed in the source.
  • If Tero will commercially release the automated waste-analysis tooling and how it will be packaged: not confirmed in the source.
  • Whether other teams adopt large-scale pattern matching and incremental rollouts to clean up their logging (the author reports the approach worked in the customer engagement described).

Quick glossary

  • Observability: The practice of collecting and analyzing telemetry (logs, metrics, traces) to understand system behavior and troubleshoot issues.
  • Log sampling: A technique that records only a subset of log events to reduce data volume and storage costs (see the sketch after this list).
  • Cardinality: A measure of the number of distinct values a tag or label can take; high cardinality can increase storage and query cost.
  • Regex: Short for regular expression, a pattern language used to match text; commonly used to filter or categorize log lines.
  • Hyperscan: A high-performance regular-expression matching library, originally developed at Intel, that compiles many patterns into one database and scans input at line rate.
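
To make the sampling entry above concrete, here is a minimal deterministic sampler; this is a sketch under assumed names, and the 1-in-10 rate and request-id key are hypothetical. Hashing the request id (rather than rolling a die per line) means every line of a kept request survives together, so sampled traces stay complete.

    import hashlib

    SAMPLE_RATE = 10  # keep roughly 1 in 10 requests (hypothetical rate)

    def keep(request_id: str) -> bool:
        """Deterministic sampling: the decision depends only on the id,
        so all lines from a kept request survive as a unit."""
        digest = hashlib.sha256(request_id.encode()).digest()
        return int.from_bytes(digest[:8], "big") % SAMPLE_RATE == 0

    # The same id always gets the same verdict, across hosts and restarts.
    print(keep("req-12345"), keep("req-67890"))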

Reader FAQ

What question is the author focused on?
How much of an organization's observability data is waste, that is, data that can be dropped without losing diagnostic value.

Did the author find an answer?
In the customer environments he analyzed, waste ranged from about 30% to 60%, averaging roughly 40%.

Did vendors help customers reduce waste?
The author reports that vendors often defer responsibility, telling customers "it's your data" rather than providing actionable waste analysis.

Is the detailed implementation of the automated system available?
The author says the deep technical details are outside the scope of the post; public availability is not confirmed in the source.
