Would love to read about experiences actually using this (I mean Mycroft in general) — good, bad, or otherwise.
Also, though: why don't we have "text assistants"? Seems to me the process of deciphering spoken text is (or should be) entirely orthogonal to performing the actual task — changing the lighting, cranking up the AC/heat, arming the security perimeter, or whatever.
I think the reason is that voice recognition is hard and so far only the "BIGASS TECH!!!" corporations have been able to make it "mom or granny ready" — and they have no incentive to do that for free and let us make our own mash ups. They want to wall us into their ecosystems.
So from that standpoint, this looks pretty cool to me — even if the voice recognition isn't as good as the big three.
OTOH, to rebut my own point: I got the new Apple Watch Ultra and I noticed that I can map the side button to a "shortcut" (the Apple term for a script you create yourself to automate something) that just transcribes whatever I say, and sends it as text over SSH to any host I want. On my local LAN, the delivery time is well under 1000ms.
So that's getting pretty close to being able to use Siri as a generic voice recognizer, and then piping the input into whatever arbitrary/homebrew system I want.
To do it purely with voice, though, you have to say "Hey Siri, do the funky chicken" (after naming the shortcut "do the funky chicken"), and then say the actual command phrase you want your home automation to act on.
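The receiving end can be dead simple, too: the shortcut just runs a command over SSH with the transcribed phrase, so a tiny dispatcher script on the host is enough. A rough sketch (the phrases and the commands they trigger are placeholders, not a real setup):

    #!/usr/bin/env python3
    # Rough sketch of a dispatcher the shortcut calls over SSH, e.g.
    #   ssh pi@homeserver 'voice_dispatch.py "turn on the lights"'
    # The phrases and the commands they map to are placeholders.
    import subprocess
    import sys

    ACTIONS = {
        'turn on the lights': ['curl', '-X', 'POST', 'http://hub.local/lights/on'],
        'turn off the lights': ['curl', '-X', 'POST', 'http://hub.local/lights/off'],
        'arm the alarm': ['curl', '-X', 'POST', 'http://hub.local/alarm/arm'],
    }

    phrase = ' '.join(sys.argv[1:]).strip().lower()
    cmd = ACTIONS.get(phrase)
    if cmd:
        subprocess.run(cmd, check=False)
    else:
        print('No action mapped for:', phrase)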
I played with Mycroft about two years ago. I had been using a couple of Google Home Minis for a while for the usual things (play Spotify, set timers, ask the weather, control lights around the house). They worked perfectly for that. At the time I decided to de-Google my life and take back my privacy, so I went looking for something open source that would give me more control over my data. I found Mycroft and played with it for a few months.
I was pretty excited about it. I bought a ReSpeaker 2.0, an embedded device that runs Linux and has a six-microphone array. I designed a custom 3D-printed case to hold the ReSpeaker and a small speaker to make my own little "Jarvis" box (Iron Man reference).
My favorite part about the whole thing was the customization. I wrote a couple of skills to do some other things for me. For example, I could say "Where can I watch X?" and it would use an API to search for a TV show or movie and tell me where it was available (Netflix, Amazon Prime, Disney+, etc.). It has always been annoying to go to Google and try to figure out where I can stream something, limited to only the services I currently subscribe to. I wrote another skill that tied into my CouchPotato instance so I could say "Download the movie X" and it would go find it and download it. If it found multiple matches, it would read off the top few and let me choose the correct one. I even tied those skills together, so if the first skill couldn't find a movie on one of my streaming services it would ask if I wanted to download it and I could simply say "yes". I also modified the code to use a custom text-to-speech API so I could configure Mycroft to use a custom voice.
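For anyone curious, a skill like that is just a small Python class. Here's a rough sketch in the style of a classic Mycroft skill (the intent file name, the availability API, and the settings key here are just placeholders):

    # Rough sketch of a "Where can I watch X?" Mycroft skill.
    # The intent file, the availability endpoint, and the 'my_services'
    # settings key are hypothetical placeholders.
    import requests
    from mycroft import MycroftSkill, intent_file_handler

    class WhereToWatchSkill(MycroftSkill):

        @intent_file_handler('where.can.i.watch.intent')
        def handle_where_to_watch(self, message):
            title = message.data.get('title')
            # Ask a (hypothetical) streaming-availability API where the title is.
            resp = requests.get('https://example.com/api/availability',
                                params={'q': title}).json()
            mine = self.settings.get('my_services', [])
            services = [s for s in resp.get('services', []) if s in mine]
            if services:
                self.speak('{} is available on {}'.format(title, ', '.join(services)))
            else:
                self.speak('I could not find {} on any of your services'.format(title))

    def create_skill():
        return WhereToWatchSkill()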
It was all really cool and I had a lot of fun playing with it. The biggest problem I ran into was wake-word recognition. It worked mostly OK for me on the ReSpeaker from close range, but I found it went downhill as I moved away. It was especially bad if the device was playing music, which is possibly the most common thing I was using my Google Home Mini for. I had hoped that the ReSpeaker would help with this, because it had the six-microphone array and some built-in loopback hardware to try to cancel out any noise the ReSpeaker itself was generating: any sound output to the speakers would be looped back into the ReSpeaker and could be subtracted from the microphones' input. I found that I just couldn't get it to work well, though. I think the music was causing vibrations that were overloading the microphone array, so it couldn't hear me over the music. It's possible it could be improved with a better hardware design that reduces vibration from the device's own speaker. Maybe it works better now, two years later. I think I had configured Mycroft to use Snowboy for wake-word recognition so I could name my Mycroft something else (Jarvis).
One day the Mycroft installation just stopped working on my device after I hadn't touched it in a week or more and I never went back to figure out what was wrong. It's still sitting on the corner of my desk unplugged. If I could have got the wake-word recognition working reliably with music playing I think I would have used it a lot, but I wasn't able to at the time.
I just recently bought a smartwatch with a built-in "Alexa" app that lets you send voice commands to your phone to be processed through the watch's official app. I'm instead using Gadgetbridge on Android to interface with the watch. Some kind hacker updated Gadgetbridge to add very basic support for my watch's microphone, allowing you to send the raw voice data to an external application. I'm hoping I'll be able to use this to revive my Mycroft instance: I'll just send voice commands to Mycroft from my watch/phone via a custom Android app/service. In theory, I'll be wearing the watch all the time anyway, and having the microphone on my person and right next to my face should help with the speech-to-text, and I won't have to worry about a wake word at all. I've only just barely started working on this, though.
I gave up on Mycroft after a long wait and built my own with a ReSpeaker and Picovoice. I have two of them with different wake words. IMO it's way better and easier than Snowboy. I don't understand why people give their data to Amazon to set a timer :)
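If anyone wants to try it, the wake-word part is only a few lines with Picovoice's pvporcupine and pvrecorder Python packages. A rough sketch, assuming a recent SDK (which needs a free access key) and a custom keyword file; the key and .ppn path are placeholders:

    # Rough sketch: listen for a custom wake word with Picovoice Porcupine.
    # The access key and the .ppn keyword file path are placeholders you supply.
    import pvporcupine
    from pvrecorder import PvRecorder

    porcupine = pvporcupine.create(
        access_key='YOUR_PICOVOICE_ACCESS_KEY',
        keyword_paths=['jarvis_custom.ppn'],  # custom wake word trained in their console
    )

    recorder = PvRecorder(device_index=-1, frame_length=porcupine.frame_length)
    recorder.start()
    try:
        while True:
            pcm = recorder.read()            # one frame of 16 kHz, 16-bit audio
            if porcupine.process(pcm) >= 0:  # keyword index, or -1 if nothing heard
                print('Wake word detected, hand off to speech-to-text here')
    finally:
        recorder.stop()
        porcupine.delete()
        recorder.delete()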
Are you using Picovoice as the assistant? Is it an entire solution for that? Or are you running a DIY Mycroft device with Picovoice as the wake-word detector? I'll have to check this out, but I've been trying to stick with open source technologies where I can. I don't trust that a free tier will remain free forever, but it may be worth testing out.
Google Assistant on your phone can accept text input. If you're on a relatively recent version of Android, you should be able to long-press the home button, then tap the keyboard icon in the popup. It works the same as a voice prompt.
A lot of assistant functionality is just getting data from the internet, which search engines already know how to present and format in a useful way.
If you need to go to a specific spot in the house to write some text that turns on a light, it seems easier to just walk to an actual light switch. For more general automation, there are visual block-based configurators for setting up triggers for smart appliances.
This is actually how Mycroft handles it, more or less.
The wakeword ("hey Mycroft") is done on-device, but everything you say after that is sent to a speech-to-text API. That text is then routed to the appropriate skill to handle. So when you're writing the skill you only worry about the content of that text
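Because the routing layer only ever sees text, you can also skip the microphone entirely and inject an utterance straight onto Mycroft's message bus (by default a WebSocket on port 8181 of the device). A rough sketch with the websocket-client package, assuming default bus settings and a made-up hostname:

    # Rough sketch: send a text "utterance" to Mycroft, no microphone involved.
    # Assumes the default message bus (WebSocket on port 8181, path /core);
    # 'mycroft.local' is a placeholder for your device's hostname.
    import json
    import websocket  # pip install websocket-client

    ws = websocket.create_connection('ws://mycroft.local:8181/core')
    ws.send(json.dumps({
        'type': 'recognizer_loop:utterance',  # same message real STT results produce
        'data': {'utterances': ['turn on the kitchen lights']},
        'context': {},
    }))
    ws.close()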