I once lived in Moscow on a compound with an adopted rescue dog. The compound had a shock collar and invisible fence setup.
Moscow’s street dogs are renowned for their intelligence. I have seen street dogs taking the escalators on the Metro. This dog worked out not just that the beeping + discomfort was worth the freedom, but also that he could wear out the battery faster by going up to the very edge of the fence - where the chirps became an uninterrupted beeeeep - and as soon as the beeping stopped, whoosh he was gone.
Fundamentally, we are at a point in time where models are already very capable, but not very reliable.
This is a very interesting finding about how to improve capability.
I don't see reliability expressly addressed here, but my assumption is that these alloys will be less rather than more reliable - stronger, but more brittle, to extend the alloy metaphor.
Unfortunately, for many if not most B2B use cases, reliability is the primary constraint! Would love to see similar ideas in the reliability space.
Great question. For me reliability is variance in performance and capability is average performance.
In practice high variance translates on the downside into failure to do basic things that a minimally competent human would basically never get wrong. In agents it's exacerbated by the compounding impact of repeated calls but even for basic workflows it can be annoying.
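To put a rough number on the compounding point, here is a minimal sketch. It assumes every call succeeds independently with the same probability, which real agents won't quite do, and the function name and the 99% figure are made up for illustration.

```python
def chain_success_rate(p_step: float, n_steps: int) -> float:
    """Chance that an n-step workflow finishes with no failed step.

    Assumption (not from the thread): each step succeeds independently
    with the same probability p_step.
    """
    if not 0.0 <= p_step <= 1.0:
        raise ValueError("p_step must be between 0 and 1")
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    # Independent steps multiply, so per-step reliability compounds.
    return p_step ** n_steps


if __name__ == "__main__":
    # Hypothetical per-step reliability of 99%.
    for n in (1, 10, 50):
        print(f"{n:>3} steps: {chain_success_rate(0.99, n):.3f}")
```

Under those assumptions a step that works 99% of the time still leaves a 50-step agent run failing roughly four times in ten, which is why a small per-call failure rate hurts so much more in agents than in single-shot workflows.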
I don’t think variance is relevant to this application which is essentially a search function. As long as they find the answer 1/100, it doesn’t matter if it took them 100 tries - that’s just a cost optimization problem here.
That being said, I think variance implicitly improves in this context, because this is the same as the poll averaging that Nate Silver does - as long as the models are truly independent, averaging improves the result across the board (i.e. both average and variance). However, if the models start converging on the same datasets and techniques, the benefit will degrade, just as it does in polling with pollster herding and the other problems that industry creates for itself.
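Both halves of that argument can be sketched with textbook formulas. This is only an illustration under simplifying assumptions: tries are independent with a fixed success rate, and the models' errors have equal variance and a single shared pairwise correlation `rho`. The function names are hypothetical.

```python
def p_at_least_one_success(p_try: float, n_tries: int) -> float:
    """Chance that at least one of n independent tries succeeds."""
    if not 0.0 <= p_try <= 1.0:
        raise ValueError("p_try must be between 0 and 1")
    return 1.0 - (1.0 - p_try) ** n_tries


def variance_of_average(sigma2: float, n_models: int, rho: float) -> float:
    """Variance of the mean of n equally noisy estimates.

    Assumption: every pair of models has the same error correlation rho
    (0 = truly independent, 1 = fully herded).
    """
    if n_models < 1:
        raise ValueError("n_models must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("this sketch only handles rho between 0 and 1")
    # Var(mean) = sigma^2 * (1 + (n - 1) * rho) / n
    return sigma2 * (1.0 + (n_models - 1) * rho) / n_models


if __name__ == "__main__":
    print(p_at_least_one_success(0.01, 100))
    for rho in (0.0, 0.5, 1.0):
        print(rho, variance_of_average(1.0, 4, rho))
```

With `rho = 0`, four models cut the variance to a quarter; with `rho = 1` averaging buys nothing at all, which is the herding failure mode in a single number.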
I had to do an MN1 application for my US-born daughter as both my partner and I were born abroad. As OP alludes to, this is a sort of side quest called "registration" which lapses if you don't do it before the child is 18 (although I think there are still routes to obtain citizenship in these circs).
The most difficult part of the process (not dealt with in this version of Passport Application but maybe a future DLC pack?) was actually finding someone who could certify my evidence (you are meant to submit originals but they keep the docs including passports for 3-6 months which is a bit unrealistic if you are living abroad). I can't remember the exact rules but it wasn't possible to use a US notary or a normal solicitor certification process and instead I needed to go to a council office.
After calling about 5 councils all of whom disavowed any knowledge of the process or its requirements I ended up finding someone at Islington Council who was delightfully helpful. But it was one of the more frustrating UK government interactions I've had.
Not caused but correlated: doing something risky is much easier when you have something to fall back on - which I’d guess is a lot more common in the Ivies
For me this is great for practice (I tried Russian). However the big missing piece for all these language learning apps is the lack of support for spotting and correcting errors in your pronunciation - as long as you say the word more or less right, the transcription gives you a pass.
I am very excited for the whole STT/TTS to go away and for us to have models that really "hear" exactly what you said.
Sometimes this is about accent but a lot of the time, the AI won't spot areas where you e.g. fudge a case ending or the stress on a word. Yes, you can get some of that pronunciation right by the AI repeating back with the correct stress or clear case, but you never really get the confidence that you would get from an actual human.
Another product suggestion - turn off transcription (at least for the tutor side of the conversation; I'd suggest both). Personally I find it distracting at best for languages I already speak well and a crutch for those I don't.
Finally, I find it really very hard to enjoy having a random conversation that's not very directed ("What interests you most about artificial intelligence?"). I'd suggest that there are ways of making it more goal focused without being explicitly gamified - maybe something like, here's a position and you have to persuade me (AI debate club!), or something that brings out an actual opinion or relates to a concrete experience ("what's your main goal in your job this year").
Overall though this is the first product I've seen in this space that I might actually use, so well done.
The persuasion lesson sounds like a great idea; we hadn't thought of that. Yeah, voice-to-voice models will be amazing. There is significant progress from OpenAI/Gemini, and we plan to use them when they are ready.
- That it's still way cheaper in most instances to book a return (especially where the "trip" straddles a weekend) rather than a one-way fare when travelling long haul - even if you just throw away the return flight.
- That you can sometimes get access to totally different inventory by booking a package including accommodation, even if that accommodation is one night in a shared dormitory in a hostel (which you just don't go to).
At least group discounts have a recognizable economic rationale. But in these examples you are getting a strict superset of the same SKU (OK, maybe the change rules might be a little tighter, but not in a way that's perceptible) for less money.
I've definitely come across the one-way flight costing more than a return
My guess is the airlines think one-way people are business folks (so the price doesn't matter because it's getting expensed), whereas return travelers are paying their own way
Round trip business (with a return 6 months later) was 2,530 Swiss francs. So I screenshotted the horrible one-way price to go in my expense report, and then booked the round trip ticket.
No, I was meant to book a one-way ticket, since I was moving offices. But I had to have evidence to show that booking round-trip was cheaper in case anyone questioned why I had purchased round-trip instead of one-way.
Try London to Washington, DC and watch your eyes pop
You might be able to find an airline where it doesn't happen, but you will definitely find airlines where it does. Just verified with Delta, British Airways and Lufthansa.
US to Europe open jaw can be weird. I've done a somewhat crazy return to the origin European city (typically Heathrow) to avoid it. And then I've had times when it's been perfectly reasonable.