Monday 28 October 2024

Show HN: Trench – Open-source analytics infrastructure https://bit.ly/3C1sbKB

Hey HN! I want to share a new open source project I've been working on called Trench ( https://bit.ly/3YmYmvy ). It's open source analytics infrastructure for tracking events, page views, and identifying users, and it's built on top of ClickHouse and Kafka. https://bit.ly/40ia0un

I built Trench because the Postgres table we used for tracking events at our startup ( https://bit.ly/3YlvIel ) was getting expensive and becoming a performance bottleneck as we scaled to millions of end users.

Many companies run into the same problem as us (e.g. Stripe, Heroku: https://bit.ly/3YqjP6H ). They often start by adding a basic events table to their relational database, which works at first, but can become an issue as the application scales. It’s usually the biggest table in the database, the slowest one to query, and the longest one to back up.

With Trench, we’ve put together a single Docker image that gives you a production-ready tracking event table built for scale and speed. When we migrated our tracking table from Postgres to Trench, we saw a 42% reduction in cost to serve on our primary Postgres cluster, and all lag spikes from autoscaling under high traffic were eliminated.

Here are some of the core features:

* Fully compliant with the Segment tracking spec, e.g. track(), identify(), group(), etc.
* Can handle thousands of events per second on a single node
* Query tracking data in real time with read-after-write guarantees
* Send data anywhere with throttled and batched webhooks
* Single production-ready Docker image. No need to manage and roll your own Kafka/ClickHouse/Node.js/etc.
* Easily plugs into any cloud-hosted ClickHouse and Kafka solutions, e.g. ClickHouse Cloud, Confluent

Trench can be used for a range of use cases. Here are some possibilities:

1. Real-Time Monitoring and Alerting: Set up real-time alerts and monitoring for your services by tracking custom events like errors, usage spikes, or specific user actions and sending that data anywhere with Trench’s webhooks.
2. Event Replay and Debugging: Capture all user interactions in real time for event replay.
3. A/B Testing Platform: Capture events from different users and groups in real time. Segment users by querying in real time and serve the right experiences to the right users.
4. Product Analytics for SaaS Applications: Embed Trench into your existing SaaS product to power user audit logs or tracking scripts on your end users’ websites.
5. Build a custom RAG model: Easily query event data and give users answers in real time. LLMs are really good at writing SQL.

The project is open source and MIT-licensed. If there’s interest, we’re thinking about adding support for Elasticsearch, direct data integrations (e.g. Redshift, S3, etc.), and an admin interface for creating queries, webhooks, etc.

Have you experienced the same issues with your events tables? I'd love to hear what HN thinks about the project. https://bit.ly/4foGt6G

October 25, 2024 at 03:07PM
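
For readers unfamiliar with the Segment tracking spec mentioned in the feature list, here is a minimal Python sketch of what track(), identify(), and group() style messages look like as plain data. The field names (type, userId, event, properties, traits, groupId, messageId, timestamp) come from the Segment spec; the helper segment_message and all sample values are hypothetical, and nothing here is Trench's own client or HTTP API, which the post does not describe.

```python
from datetime import datetime, timezone
import json
import uuid

# Message types named in the post's feature list, plus "page" for page views.
SUPPORTED_TYPES = {"track", "identify", "group", "page"}


def segment_message(kind, user_id, **fields):
    """Build a Segment-spec style message as a dict (hypothetical helper)."""
    if kind not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported message type: {kind}")
    message = {
        "type": kind,
        "userId": user_id,
        "messageId": str(uuid.uuid4()),  # unique ID, useful for deduplication
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    message.update(fields)  # type-specific fields such as event or traits
    return message


# Sample values below are made up for illustration.
track = segment_message(
    "track", "user-123", event="Signed Up", properties={"plan": "pro"}
)
identify = segment_message(
    "identify", "user-123", traits={"email": "a@example.com"}
)
group = segment_message(
    "group", "user-123", groupId="org-42", traits={"name": "Acme"}
)

print(json.dumps([track, identify, group], indent=2))
```

How such messages are actually delivered to Trench (endpoint, authentication, batching) is left to the project's documentation.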
