Is there any way to run it completely self-hosted? If not, are there plans? And how will you monetize self-hosted options (if it's possible)?
Pixie [1] is a similar project and offers the self hosted model you are looking for.
We also support 11 application protocols [2] with TLS handshake tracing and MQTT support coming soon (encrypted traffic tracing has been supported for a long time).
[1] https://px.dev
Perhaps you meant DISCLOSURE
subtrace run -devtools=/subtrace -- python3 -m http.server
This starts a Python server on localhost:8000 but with Subtrace. Everything except /subtrace is forwarded to the Python server like usual, but if you go to http://localhost:8000/subtrace you should see the Chrome DevTools network tab running in the browser like a regular app. Any request you send to localhost:8000 + all outgoing requests made from inside the Python app will automatically appear in that dashboard!
Is it possible to mimic "subtrace.dev"? There's the 'SUBTRACE_ENDPOINT' environment variable which can be used to set the target endpoint, but is the server side open source too? And does the license grant permission for self hosting the full stack?
But it really looks useful and I'll definitely play with it to see if I put it into my toolbox.
re the Wireshark analogy: the reason I used that was because: (1) Subtrace operates at roughly the same level in the operating system stack, (2) has similar capabilities, (3) has an overlap in use-cases, and (4) has been the most effective at communicating what Subtrace is in my experience so far. I can see why the analogy is not a perfect 1:1 mapping (obligatory xkcd: https://xkcd.com/624), but naming things is hard and taglines are just names in idea space :)
I'm saying this just FYI. I haven't actually looked at what your product does, but if it were to matter to me, it'd be based on what it can offer that Wireshark can't, rather than how similar it is to Wireshark.
Reading the documentation makes it sound like this sits one or two levels above where Wireshark usually operates, which is why I think the analogy is tough.
And people who don't know how to use Wireshark probably want a better motivation to use Subtrace than 'it's like Wireshark', because there is a reason why they don't use it.
IMO the real UX problem is that there’s nothing in between “Wireshark” and “DevTools.”
What would be cool is a program that renders the real DevTools UI from a .pcap file (or stdin).
There’s a lot of use cases where I can capture all the traffic with Wireshark but I just want a basic DevTools interface.
Since we operate at the TCP level, we can actually handle pretty much any protocol. I have an implementation of a postgres handler in my git stash that intercepts and shows the SQL queries executed + the resulting rows alongside the HTTP request that triggered it (I still need to do some robustness and correctness testing before it's ready to merge). With a handful of other protocols like MySQL, Mongo, Redis, Kafka, or even FTP lol, I think Subtrace can cover most practical dev workloads.
Btw Subtrace can already record .pcap files today since it's just a simple TCP stream proxy, but raw network packet captures are mostly only useful when you're implementing new protocols, which 99% of the people using Docker containers today aren't doing. It's also a solved problem because you can just run `apt-get install tcpdump` inside the container.
Automatic tracing for app-level protocols that is easy to setup, works everywhere, lightweight for prod, fast to search, and can show the data in a clean interface is still insanely difficult today. That's the problem Subtrace is trying to solve.
When I came in I was hoping to see a product that actually was for container networking, not just app data flows. Again - this is a neat tool, and probably incredibly useful for people developing way up the stack like that, but a lot of us live below the bottom of a "full-stack developer's" stack. Some features I would expect in a "wireshark for Docker containers":
* Ability to inspect DNS traffic
* Ability to trace packets that enter the container network stack (e.g. the packet(s) generated when the server calls send()) into the virtual interface for the namespace and through the host machine. Ideally with multiple observation points and correlation of packets in the overlay/underlay contexts (decrypting from local keys whenever possible).
* Ability to see where a packet dies between the app and exit NIC on the host. Including things like "packets delivered to this container even though the dest IP/subnet isn't in this container"
* Similar to the previous point: ability to track packets through all the NAT steps containers introduce.
* See arp traffic on virtual interfaces.
* Ability to observe the TLS handshake and gather all the parameters of the connection.
* Packet dissection and protocol session tracing for all the tunnels.
* Bonus points if you can capture weird teleports caused by ebpf programs in the network path.
I expect this because that's how I use wireshark + bpftrace in container environments for the most part. I've also used it to debug while implementing protocols, but that's a less common use case of packet dissection and tracing in wireshark.
What you've built is cool, and I can see it expanding in a lot of directions very, very, usefully. I just really dislike something calling itself a wireshark while not really helping with networking (and in fact - the networking has to work reasonably well for this tool to be effective).
We generate an ephemeral TLS root CA certificate and inject it into the system store. The generated certificate is entirely in-memory and never leaves the machine. To make this work without root privileges, we intercept the open(2) syscall to see if it's /etc/ssl/certs/ca-certificates.crt (or equivalent). If so, we append the ephemeral root CA to the list of actual CA certificates; if not, we let the kernel handle the file open like usual. This way, none of the other programs are affected, so only the program you start with `subtrace run` sees and trusts the ephemeral root CA.
After we get the program to trust the ephemeral root CA, we can proxy outgoing TLS connections through Subtrace transparently but also read the cleartext bytes.
All of this is fully automatic, of course.
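To make the open(2) step above concrete, here is a minimal Python sketch of just the decision logic: given the path a traced program asks to open, decide whether to hand back the real file or the real CA bundle with the ephemeral root appended. This is not Subtrace's actual code. The function name, the second bundle path (my guess at one "equivalent"), and the fake certificate bytes are all assumptions for illustration, and the real syscall interception is out of scope here.

```python
# Hypothetical sketch of the "is this the CA bundle?" check described above.
# Names and the second path are assumptions, not Subtrace's implementation.
CA_BUNDLE_PATHS = frozenset({
    "/etc/ssl/certs/ca-certificates.crt",  # path named in the comment above
    "/etc/pki/tls/certs/ca-bundle.crt",    # assumed equivalent on RHEL-family systems
})


def contents_for_open(path: str, real_contents: bytes, ephemeral_ca_pem: bytes) -> bytes:
    """Return the bytes the traced program should see when it opens `path`."""
    if path not in CA_BUNDLE_PATHS:
        # Not a CA bundle: behave exactly as if nothing was intercepted.
        return real_contents

    # Keep PEM blocks on separate lines if the bundle lacks a trailing newline.
    if real_contents and not real_contents.endswith(b"\n"):
        real_contents += b"\n"

    # Append the in-memory root CA after the real system roots.
    return real_contents + ephemeral_ca_pem
```

Only the process whose open() call goes through this check sees the extra certificate, so other programs on the machine keep seeing the unmodified bundle.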
It won't work with programs that defensively validate the cert chain but those are rare.
It won't work with programs that embed their own root cert store, which is also rare but I would guess less rare than the previous one. The usual reason to do this is to minimize OS deps, and in the case of Docker containers to save on container image size by only including the roots you care about.
But yes for the vast majority of programs it should work fine.
We still try our best by handling as much of the long tail of environments with some library/framework specific workaround (e.g. Deno bundles all TLS certs in its binary so we set the DENO_CERT env var when applicable).
Cert pinning has to read a public cert from memory, right? And a public cert has a well-known shape… and you have bpf and access to the memory…
- Pixie (https://px.dev) -- which I contribute to
- Beyla (https://github.com/grafana/beyla)
- Coroot (https://github.com/coroot/coroot)
If you are interested in the details and how the strategy for this tracing has evolved, you can learn more in this blog (https://blog.px.dev/ebpf-tls-tracing-past-present-future/).
Edit: I've reviewed the docs and it looks like you do run it on the same server. For clarity, I've used Sentry before.
Everything is exactly as secure as before Subtrace. In other words, using Subtrace doesn't make the NSA's job any easier ;)
subtrace run -devtools=/subtrace -- python3 -m http.server
Then go to http://localhost:8000/subtrace and you'll see the Chrome DevTools interface running in your browser like a regular app. Any request sent to http://localhost:8000 should appear there in realtime. Note that the -devtools flag is kinda best effort because Subtrace is intended to be used in production.
We have fully on-prem installations for enterprises, of course (e.g. if you're in a highly regulated industry like healthcare).
subtrace run -devtools=/subtrace -- python3 -m http.server
Run the above command on a Linux machine, go to http://localhost:8000/subtrace and send some requests to localhost:8000 to see a stripped down version of the Subtrace dashboard working fully locally.
It's like Honeycomb's wide events but even better because: (1) you can see the whole request including the payload alongside the event fields, and (2) it's fully automatic and requires no code changes out of the box (you can incrementally add these tags when you find a need for each one instead of the huge upfront cost from instrumenting the hell out of your codebase).
The TLS certificate setup is more tricky but that is always going to be a pain.
Burp Proxy is another great tool that is even more powerful but harder to set up.
One of the things we're thinking about is automatic method/function call tracing. Something like attaching the entire stack trace of calls done to handle the API request. Ideally using the same UI so that you can see the headers/payload that was sent and the function-level stack trace right next to each other. None of the OpenTelemetry verbosity, all of the observability!
In the hot path, Subtrace is just a dumb proxy that copies bytes. All of the processing + indexing happens offline in a different machine (in Clickhouse).
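As a rough illustration of what "a dumb proxy that copies bytes" means, here is a Python sketch of a copy loop that forwards chunks unchanged and sets a copy aside for later processing. It is not Subtrace's code: `pump`, `tap`, and the chunk size are hypothetical names and values, and it uses generic file-like objects instead of real sockets.

```python
import io
from typing import BinaryIO, List


def pump(src: BinaryIO, dst: BinaryIO, tap: List[bytes], chunk_size: int = 64 * 1024) -> int:
    """Copy bytes from src to dst unchanged, recording each chunk in tap.

    No parsing happens here; anything expensive (decoding, indexing)
    is assumed to run elsewhere on the recorded chunks.
    """
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty read means the stream is finished
            break
        dst.write(chunk)   # forward to the real destination first
        tap.append(chunk)  # then keep a copy for offline processing
        copied += len(chunk)
    return copied


if __name__ == "__main__":
    payload = b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n"
    upstream = io.BytesIO()
    recorded: List[bytes] = []
    pump(io.BytesIO(payload), upstream, recorded, chunk_size=8)
    print(upstream.getvalue() == payload, b"".join(recorded) == payload)
```

The point of the shape is that the forwarding path never waits on analysis: the loop only reads, writes, and appends.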
We also have an on-prem version of Subtrace for enterprises. It runs in their own AWS account without ever talking to subtrace.dev so that companies in regulated industries like healthcare can use Subtrace.
I'd probably use a postman related pitch instead. This is much closer to that and looks like a nice complement to that workflow
Show HN: Stratoshark, a sibling application to Wireshark - https://news.ycombinator.com/item?id=42793777 - Jan 2025 (50 comments)
> Stratoshark captures and analyzes system calls and logs using libsinsp and libscap, and can share capture files with the Sysdig command line tool and Falco
If you'd like to use Subtrace on Windows, it would be super helpful to understand your use-case deeply so that we build the right things in the right order. Please reach out to me at adtac@subtrace.dev, I'd love to chat!
Pixie (https://px.dev) can be installed in under 5 mins and gives this level of visibility across all applications. No need to change your application (wrap in `subtrace run`) to get instant visibility.
We also support 11 application protocols (https://docs.px.dev/reference/datatables/) with TLS handshake tracing and MQTT support coming soon (encrypted traffic tracing has been supported for a long time).
I'm working on an even simpler way where you can just `kubectl apply` a DaemonSet or a Helm chart to get automatic tracing for all pods in your cluster instantly without any code-level changes. If anyone is interested in beta testing this, email me at adtac@subtrace.dev, I'd love to understand your use case!
For monitoring the network traffic for the whole cluster, the CNI and/or whatever ebpf-based runtime security stuff you’re using (falco, tetragon, tracee) is usually enough, but I can definitely see the usefulness of Subtrace for more specific debugging purposes. If run as a DaemonSet make sure to add some pod filtering such as namespace and label selectors (but I’m sure you’ve already thought about that).
That's a great suggestion. It'd be like kubectl exec-ing into a shell inside the pod, but for network activity. I think I'm going to prototype this tonight :)
> pod filtering such as namespace and label selectors
Yep, Subtrace already tags each request with a bunch of metadata about the place where it originated so that you can filter on those in the dashboard :) Things like the hostname, pod, cluster, AWS/GCP location are automatically populated, but you can also set custom tags in the config [1].
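For readers wondering what filtering on those tags amounts to, here is a small Python sketch of selector-style matching over tagged requests. The record layout, the `filter_by_tags` helper, and the example tag values are hypothetical and only illustrate the idea. They are not Subtrace's data model or query syntax.

```python
from typing import Dict, List

# Hypothetical record shape: each captured request carries a flat dict of tags.
Request = Dict[str, object]


def filter_by_tags(requests: List[Request], selector: Dict[str, str]) -> List[Request]:
    """Keep requests whose tags match every key/value pair in the selector."""
    return [
        req for req in requests
        if all(req.get("tags", {}).get(key) == value for key, value in selector.items())
    ]


if __name__ == "__main__":
    captured = [
        {"path": "/api/users", "tags": {"pod": "web-1", "cluster": "prod", "namespace": "shop"}},
        {"path": "/healthz", "tags": {"pod": "web-2", "cluster": "prod", "namespace": "infra"}},
        {"path": "/api/cart", "tags": {"pod": "web-3", "cluster": "staging", "namespace": "shop"}},
    ]
    for req in filter_by_tags(captured, {"cluster": "prod", "namespace": "shop"}):
        print(req["path"])
```

An empty selector matches everything, which mirrors how label selectors usually behave.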
(1) If the Show HN succeeds, then they can save their Launch HN card to play later; if it doesn't succeed, they can go ahead and do a Launch HN soon afterwards.
(2) Show HN is more about the project/tech, especially when it's open-source. Launch HN is more about the startup/business.
(3) Nowadays many startups offer an open-source (or open core) version for free, then make money with a hosted version. For these, it's a good fit to first do a Show HN about the open-source offering, and then later do a Launch HN about the cloud product.
If I were to give an example of actual advice I've heard, using the terms of your comment, it might be: keep the nerds happy with free open-source software that actually works, then charge money for things that companies (especially enterprise companies) actually prefer to pay for.
The "paid hosted offering" pattern is the most common of these, since nerds might want to run their own instances but many IT departments do not.