1 comment

  • severn-lortie 10 hours ago
    This is genuinely fascinating research. The fact that this is basically just a fine-tuned Gemini model being fed a bunch of camera feeds and raw sensor data, and it’s outperforming existing methods, is wild. The model's ability to explain (in plain English!) the reasons for its decisions is hugely important for pushing V&V forward with end-to-end solutions. The technology has big computational demands, and the amount of data that can be processed is limited. However, these kinds of limitations usually get resolved quickly (think about how much better the perf of small LLMs has gotten in the last year).