In Profinet / Ethernet-based networks it's more common to use ARP or mDNS for discovery because multicast addresses are supported everywhere. Multicast DNS would sit independently on top of the existing network layer and be compatible with smartphones and other consumer devices (and even printers). That's why I'm asking: was there a specific reason not to use mDNS?
AirPrint, AirScan, AirDrop and other things are based on Bonjour (mDNS / DNS-SD) and are supported pretty well on consumer devices and routers. [1]
Currently, your project seems to be an opinionated wrapper on top of libp2p. For this to become a proper distributed toolkit, you lack an abstraction for apps to collaborate over shared state (incl. convergence after partition). Come up with a good abstraction for that, and make it work p2p (e.g. delta-state CRDTs, or op-based CRDTs on top of a replicated log; event sourcing ...). Tangentially related, a consensus abstraction might also be handy for some applications.
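To make that concrete, here is a minimal sketch of the delta-state idea (a grow-only counter; the types and names are made up, not your API), just to show the kind of merge semantics such an abstraction would need to expose:

```rust
use std::collections::HashMap;

/// Minimal delta-state grow-only counter (G-Counter).
/// Each node increments only its own slot; merging takes the
/// per-node maximum, so replicas converge after a partition heals.
#[derive(Clone, Default, Debug)]
struct GCounter {
    counts: HashMap<String, u64>, // node id -> count
}

impl GCounter {
    /// Increment locally and return a small "delta" that can be gossiped
    /// instead of shipping the whole state.
    fn increment(&mut self, node: &str) -> GCounter {
        let entry = self.counts.entry(node.to_string()).or_insert(0);
        *entry += 1;
        let mut delta = GCounter::default();
        delta.counts.insert(node.to_string(), *entry);
        delta
    }

    /// Join (merge) is commutative, associative and idempotent,
    /// so deltas can arrive late, duplicated or out of order.
    fn merge(&mut self, other: &GCounter) {
        for (node, &count) in &other.counts {
            let slot = self.counts.entry(node.clone()).or_insert(0);
            *slot = (*slot).max(count);
        }
    }

    fn value(&self) -> u64 {
        self.counts.values().sum()
    }
}

fn main() {
    let (mut a, mut b) = (GCounter::default(), GCounter::default());
    let delta_a = a.increment("robot-a");
    let delta_b = b.increment("robot-b");
    // Exchange only the deltas over the mesh; both sides converge.
    a.merge(&delta_b);
    b.merge(&delta_a);
    assert_eq!(a.value(), b.value());
    println!("converged value: {}", a.value());
}
```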
Also check out [iroh](https://github.com/n0-computer/iroh) as a potential replacement for the p2p layer, as well as [Actyx](https://github.com/Actyx/Actyx) as inspiration from a similar (sadly failed) project using rust-libp2p.
Oh, and you might want to give your docs a grammar review.
Kudos for showing!
Abstractions for collaboration are currently in the works, and we hope to release them soon. The work on consensus has already started. Your suggestions all seem very interesting, and we'll definitely consider them. We are also currently talking to potential users to build handy and approachable abstractions for them.
I saw that [freenet](https://docs.freenet.org/components/contracts.html) went with CRDTs, but I think they made it too complicated. We were thinking about a graph (or wide-column) store with an engine similar to Cassandra and a frontend like (or ideally just) SurrealDB.
I remember that iroh moved away from libp2p when they dropped IPFS compatibility and moved to a self-built stack: https://www.iroh.computer/blog/a-new-direction-for-iroh When we got started, iroh's capabilities didn't really fit the bill, but it seems like it's time to reevaluate that. As a former contributor to rust-libp2p, I never quite got the frustration with libp2p that many people (iroh included) have, especially since many of the described problems seemed fixable. I would have preferred that they fix them instead, so libp2p could remain the shared base people build these things on.
I remember Actyx being a rust-libp2p user, but I wasn't aware that they failed. Do you have more info? How and why? It would be great if we could learn from them.
Grammar will be reviewed ;) thank you!
The usual problems with these things are discovery and security. Discovery is done via local WiFi broadcast. Not clear how security is done. How do you allow ad-hoc networking yet disallow hostile actors from connecting?
Security is still a big issue. In the current state, there is no security other than application-layer encryption (QUIC & TLS 1.3). That is fine for pilot projects, but it should not be used for anything sensitive. Maybe we should point this out more clearly in the docs. However, some Wi-Fi chips (not the ones on the Raspberry Pi, sadly) also allow setting a password in ad-hoc (IBSS) mode, and 802.11s has native support for encryption. In general, there is a problem with the lack of adoption of these standards by Wi-Fi chip manufacturers, and, in the case of Broadcom (the chip on the Raspberry Pi), a lack of support in the Linux kernel driver.
We are planning to implement authentication and encryption in the upcoming release, but this might be a paid feature.
Typically, enterprise networks are encrypted via 802.1X (since a leaked key wouldn't compromise the whole network), and we might be able to build a decentralised RADIUS server, but I'm not very fond of that idea.
Ideally, the damage one can do by joining the network unauthorized should be very limited anyway, and authentication and encryption should happen on Layer 5.
Would love feedback / inspiration / suggestions
I wish it could be a more mainstream, hobbyist auth solution though; it's completely free, open, and self-sovereign, and it makes strong security guarantees. There's just a steep learning curve to grok what's happening. I think it would be a big achievement if somebody slapped a friendly API / wizard over configuring a CA and creating certs to install on each of your robots / IoT sensors / whathaveyou. Corsha [1] is one provider in this space, and Yubico is contributing too [2], allowing you to sign cert requests with your Yubikey.
[0] https://keystore-explorer.org/features.html
[2] https://www.yubico.com/resources/glossary/what-is-certificat...
Another way would be to somehow prove that you belong.
It's annoying that we don't have a decent solution to this even for home automation. You ought to be able to take a "house ID key", probably a Yubikey, and present it to all your devices to tell them "you're mine now". Then they can talk to each other.
There are military cryptosystems which have such hardware. There's a handheld device called the Simple Key Loader.[1] That's what's used to load secure voice keys into radios, encrypted GPS keys into GPS units, identify-friend-foe codes into aircraft, and such. It's 15 years old, runs Windows CE, has a screen with a pen, and is far too big. The Tactical Key Loader is smaller and simpler.[2] 7 buttons and a small screen. About the same size as a Flipper Zero, but ruggedized and expensive.
[1] https://info.publicintelligence.net/SKLInstructionGuide.pdf
[2] https://www.l3harris.com/all-capabilities/kik-11-tactical-ke...
Can you provide some insight as to why this would be preferred over an orchestration server? In this context: Would a 'mothership'/Wheel-and-spoke drone responsible for controlling the rest of the hive be considered an orchestration server?
This isn't my area of expertise but I think "Hive mind drones" tickles every engineer.
We are in the current YC W25 batch and our vision is to build a developer framework for autonomous robotics systems from the system we already have.
> Can you provide some insight as to why this would be preferred over an orchestration server?
It heavily depends on your application; there are cases where it makes sense and others where it doesn't. The main advantages are that you don't need an internet connection, the system is more resilient against network outages, and, most importantly, the resources on the robots, which would otherwise be idle, are used. For hobbyists, I think the main upside is that it's quick to set up: you only have to turn on the machines, and it should work without having to care about networking or setting up a cloud connection.
> Would a 'mothership'/Wheel-and-spoke drone responsible for controlling the rest of the hive be considered an orchestration server?
If the mothership is static, in the sense that it doesn't change over time, we would consider it an orchestration server. Our core services don't need that, and we envision that most of the decentralized algorithms running on our system also don't rely on such a central point of failure. However, there are some applications where it makes sense to have a "temporary mothership". We are currently working on a "group" abstraction, which continuously runs a leader election to determine a "mothership" among the group (this is still fault-tolerant, however: the leader can fail at any time, and the system will promptly elect another one).
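As a purely illustrative sketch (not our actual implementation or API), the election rule can be as simple as "the lowest live peer ID is the leader"; real failure detection and membership handling are left out here:

```rust
use std::collections::BTreeSet;

/// Toy leader election: the live peer with the lowest ID is the leader.
/// Every node runs the same rule, so as long as the membership views
/// converge, they agree on the same "temporary mothership".
#[derive(Default)]
struct Group {
    live_peers: BTreeSet<String>, // peer ids we currently believe are alive
}

impl Group {
    fn peer_joined(&mut self, id: &str) {
        self.live_peers.insert(id.to_string());
    }

    /// Called when heartbeats from a peer time out.
    fn peer_failed(&mut self, id: &str) {
        self.live_peers.remove(id);
    }

    /// The leader is simply the smallest live ID; if it fails,
    /// the next "election" is implicit and immediate.
    fn leader(&self) -> Option<&String> {
        self.live_peers.iter().next()
    }
}

fn main() {
    let mut group = Group::default();
    for id in ["drone-7", "drone-2", "drone-4"] {
        group.peer_joined(id);
    }
    assert_eq!(group.leader().map(String::as_str), Some("drone-2"));

    // The current leader crashes; leadership moves without coordination.
    group.peer_failed("drone-2");
    assert_eq!(group.leader().map(String::as_str), Some("drone-4"));
    println!("new leader: {:?}", group.leader());
}
```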
To that end, I'm not clear on the benefit of this model. To solve that problem I would just take a centralized framework and stick it inside an oversized drone/vehicle capable of carrying the added weight (in CPU, battery, etc.). There are several centralized models that don't require an external data connection.
> the resources on the robots, which are idle otherwise, are used
But what's the benefit of this? I don't see the use case for needing the swarm to perform lots of calculations beyond the ones required for its own navigation and communication with others. I suppose I could imagine a chain of these 'idle' drones acting as a communication relay between two separate, active hives, but the benefit there seems marginal.
> our system also don’t rely on such central point of failure
This seems like the primary upside, and it's a big one. I'm imagining a disaster or military situation where natural or human forces could be trying to disable the hive. Now, instead of knocking out a single mothership ATV, each and every drone needs to be removed to fully disable it. Big advantage.
> We are currently working on a "group" abstraction
Makes sense to me. That's the 'value add'; might as well really spec that out.
> leader election to determine a “mothership” among the group
This seems perfectly reasonable to me and doesn't remove the advantages of the disconnected "hive". But I do find it funny that the solution to decentralization seems to be simply having the centralization move around easily / flexibly. It's not a hive of peers, it's a hive of temporary kings.
> I would just take a centralized framework and stick it inside an oversized drone/vehicle capable of carrying the added weight
Makes sense. I think there are scenarios where such “base stations” are a priori available and “shielded,” so in this case, it might make more sense to just go with a centralized system. This could also be built on top of our system, though.
> But what’s the benefit of this?
I agree that, in many cases, the return on saving costs might be marginal. However, say you have a cluster of drones equipped with computing hardware capable enough to run all algorithms themselves—why spin up a cloud instance for running a centralized version of that algorithm? It is more of an engineering-ideological point, though ;)
> But I do find it funny that the solution to decentralization seems to be simply having the centralization move around easily / flexibly. It’s not a hive of peers, it’s a hive of temporary kings.
Most of our applications will not need this group leader. For example, the pubsub system does not aggregate and dispatch messages at a central point (as MQTT does) but employs a gossip mechanism (https://docs.libp2p.io/concepts/pubsub/overview/).
What I meant is that, in some situations, it might be more efficient (and easier to reason about) to elect a leader. For example, say you have an algorithm that needs to do a matching between neighboring nodes: each node has some data point, and the algorithm wants to compute a pairwise similarity metric and share all computed metrics back to all nodes. You could do some kind of ring-structured algorithm, where you have an ordering among the nodes, and each node receives data points from its predecessor, computes its own similarity against the incoming data point, and forwards the received data point to its successor. If one node fails, its neighbors in the ring simply switch to the next node. This would be truly decentralized, with no single point of failure. However, in most cases, this approach has higher computation latency than just electing a temporary leader (letting the leader compute the matchings and send them back to everyone). So someone who cares about efficiency (rather than resiliency) will probably want such a leader mechanism.
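To make the trade-off concrete, here is a toy sketch (made-up similarity function, no networking, message rounds as a crude stand-in for latency): the ring needs n-1 forwarding rounds, while a leader needs roughly one round to gather and one to broadcast.

```rust
/// Toy comparison of the two strategies for all-pairs similarity.
/// The similarity function and the "rounds" metric are made up purely
/// for illustration; there is no networking here.
fn similarity(a: f64, b: f64) -> f64 {
    -(a - b).abs() // closer data points score higher
}

/// Ring approach: each node forwards the (origin, value) pair it currently
/// holds to its successor every round; after n-1 rounds everyone has scored
/// every other node's data point. No coordinator, but n-1 rounds of latency.
fn ring_all_pairs(points: &[f64]) -> (Vec<Vec<f64>>, usize) {
    let n = points.len();
    let mut scores = vec![vec![0.0; n]; n];
    let mut held: Vec<(usize, f64)> = points.iter().copied().enumerate().collect();
    let mut rounds = 0;
    for _ in 0..n.saturating_sub(1) {
        held.rotate_right(1); // every node passes its held pair to its successor
        rounds += 1;
        for (node, &(origin, value)) in held.iter().enumerate() {
            scores[node][origin] = similarity(points[node], value);
        }
    }
    (scores, rounds)
}

/// Leader approach: everyone sends its point to the elected leader, which
/// computes all pairs and broadcasts the result: roughly two rounds total.
fn leader_all_pairs(points: &[f64]) -> (Vec<Vec<f64>>, usize) {
    let n = points.len();
    let mut scores = vec![vec![0.0; n]; n];
    for i in 0..n {
        for j in 0..n {
            scores[i][j] = similarity(points[i], points[j]);
        }
    }
    (scores, 2)
}

fn main() {
    let points = [0.1, 0.4, 0.9, 1.3, 2.0];
    let (_, ring_rounds) = ring_all_pairs(&points);
    let (_, leader_rounds) = leader_all_pairs(&points);
    println!("ring: {ring_rounds} rounds, leader: {leader_rounds} rounds");
}
```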
One question I have is how quickly discovery and switching can take place. For instance, if I have a robot enter a lift, can the lift detect the robot's entrance and trigger its behaviours?
I think the size estimation could also be implemented within the provided abstractions (mainly request-response), though it might require you to keep track of neighbors. You could implement both algorithms using our SDKs (none for Go yet).
If you need more control or performance beyond what we expose through our SDKs, you might need to write a custom libp2p behaviour and add it to our daemon. The libp2p part is fairly involved, but either way, I would love to help you out :)
I'm so disappointed that I've never seen your blog before. The stuff you write about is so interesting and actually addresses some issues we are facing. I just sent you an email :)
We also don’t have a definitive hardware spec requirement yet. We’ve tested it on Raspberry Pi 3s and later models (so anything more capable than a 3 should be fine).
> not ESP32 for example
Running on an ESP32 is tricky because it would require porting libp2p to an embedded target (which, as far as we know, nobody has done yet). However, we are considering support for embedded "light" nodes that run only a limited portion of the stack. It depends on the feedback we get. Do you have a use case where you'd need it to run on embedded hardware?
Have you ever played any of the Horizon (Zero Dawn/Forbidden West) games? :)
Jokes aside, it looks pretty cool. What kind of hardware have you tested it with so far? Is this using WiFi only?
Thank you! So far, we have tested it with Raspberry Pi 4/5. Jetson boards are on backorder. We have some Intel WiFi chips (since they support some stuff we want), and we will get around to trying them next.
The binaries were also tested on x86 machines.
In general, I'm not too worried about hardware support, since batman-adv is quite widely deployed on a diverse set of hardware and the rest is hardware-agnostic.
For paid features, we have several ideas: a hosted management plane to configure and control the swarm (with company RBAC integration) when one of the nodes is connected to the internet; advanced security (currently no access management or authentication happens); sophisticated orchestration primitives; and LoRa connectivity (to extend the mesh radius to miles).
Appreciate your feedback on this!
I love the general ideas around free and powerful p2p functionality with hosted management or higher-level features.