I've built an app that has the same goal (not operating a mouse) but approaches it completely differently.
Rather than trying to simulate moving the mouse itself, Shortcat [https://shortcat.app/] indexes the user interface (buttons, text fields, links, menus, etc.) and enables fast fuzzy search of the interface. Type a word, an abbreviation, or a hint and hit Enter to click or action the element. Works almost everywhere on macOS, including browsers, Electron apps, and even iOS apps!
The goal is to minimise the cognitive overhead of achieving a particular intent, so being able to type a word to hit a button, or activate a deep menu item when you don't know the shortcut, is quick and easy.
I'm currently working on a modal option which enables staying within Shortcat to navigate an interface, as well as chords for simulating scrolling and arrow keys.
However, Shortcat relies on the Accessibility API to index UI elements, so it depends on how well an app or website has implemented it. One of the goals is to help improve accessibility implementations by exposing more people to them and pushing developers to fix broken or incorrectly implemented accessibility tagging.
Shortcat is macOS only for now as I haven't been able to investigate how viable doing this on Windows or Linux would be, especially on Linux considering all the different toolkits that exist.
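To illustrate the "index the UI, then fuzzy search it" idea described above, here is a minimal sketch in Python. It is not Shortcat's actual algorithm: the `Element` type, the subsequence matching, and the scoring weights are all assumptions made up for illustration, and a real tool would read the labels from the platform's accessibility tree instead of a hard-coded list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical stand-in for one indexed UI element."""
    role: str   # e.g. "button", "link", "menu item"
    label: str  # text exposed through the accessibility tree


def fuzzy_score(query: str, label: str) -> Optional[int]:
    """Score `label` against `query`, or return None if there is no match.

    The query matches when its characters appear in the label in order
    (a subsequence match). Characters that land on a word start or
    directly follow the previous match earn a bonus.
    """
    q, text = query.lower(), label.lower()
    score, pos, prev = 0, 0, -2
    for ch in q:
        idx = text.find(ch, pos)
        if idx == -1:
            return None  # a query character is missing: no match
        score += 1
        if idx == 0 or text[idx - 1] in " -_/":
            score += 3   # bonus: match at the start of a word
        if idx == prev + 1:
            score += 2   # bonus: consecutive characters
        prev, pos = idx, idx + 1
    return score


def search(query: str, elements: List[Element]) -> List[Element]:
    """Return the matching elements, best score first."""
    scored = []
    for el in elements:
        s = fuzzy_score(query, el.label)
        if s is not None:
            scored.append((s, el))
    # sort() is stable, so equal scores keep their original order
    scored.sort(key=lambda pair: -pair[0])
    return [el for _, el in scored]


if __name__ == "__main__":
    ui = [
        Element("button", "Disable"),
        Element("menu item", "Save As"),
        Element("button", "Cancel"),
    ]
    for el in search("sa", ui):
        print(el.role, el.label)
```

With these made-up weights, typing "sa" ranks "Save As" above "Disable" because both characters fall at the start of the label, and "Cancel" is dropped because it has no "s".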
I love this new wave of tools coming out for mouseless computer use. Chronic mouse use has destroyed my wrist so I have to avoid using it as much as possible.
I love Shortcat's approach of indexing the UI in general. However, in my experience, the reliance on the Accessibility API is actually a significant downside in the real world, since so many apps don't properly implement it. I feel like Warpd is a good complement to this: you could use Hint or Grid mode as a fallback when the indexing approach fails.
I wish I could use Shortcat or Warpd, but unfortunately I'm on Windows. Does anyone have any good tool recommendations for Windows? Currently, I'm using:
1. Vimium for Chrome (so good, wish I could just use it across the OS).
2. Hunt and Peck (https://github.com/zsims/hunt-and-peck) has been my favorite for OS-level use; it's a simple version of Shortcat for Windows. But it's not maintained and not as slick as some of these newer tools.
If you're already using Vimium, I suggest trying qutebrowser, which takes keyboard accessibility to a whole new level, by making it a first-class feature for the entire browser.
It does basically cut out the mouse, and had a several-days learning curve for me, but after that it's pretty great. Here are some cool features, off the top of my head:
* Python-scriptable, though I haven't figured out how to use this yet.
* Bind JavaScript bookmarklets to a keyboard shortcut (use :bind with the jseval command).
* Toggle not only JavaScript, but image loading and a whole slew of other features, with a keyboard macro.
* Vertical tabs.
* All config is adjustable via commands.
* Keyboard macros like "pop tab into a new window", "clone tab", "close all other tabs", etc.
* Text selection using the keyboard.
* Quite similar keyboard dynamics to vim.
It has a built-in ad blocker, and you should run :adblock-update when you first use it.
Another browser which is similar, but which I haven't gotten into as much, is Luakit.
I have used a trackball for many years since my wrist started bothering me, and I love it. I am right-handed, and I use Logitech's 575 and MX Ergo. I prefer the Ergo, even though it is more expensive. I keep it beside me on the couch where I sit. That way my elbow makes a 90 degree angle. Very comfortable. My keyboard is on my lap and my monitor at eye level.
Nice, I've actually tried a trackball myself, but with the way I use my desk (sit/stand) it caused more problems than it solved (shoulder issues). Ergonomics is an art I suppose.
I never comment on HN however I just want to say I've downloaded your app and it's very impressive - I'm going to try and incorporate this into my workflow the best I can. Thanks!
I did charge for it a couple of years ago. However, I rebuilt the whole thing from scratch after a long hiatus and hadn't bothered to reimplement licensing because the existing options all kinda suck, so I figured I'd focus the time on features and usability first. I think the modal mode in the next release will bring it much closer to a 1.0 release.
If you bundle it and release a paid for application on the App Store, I would totally buy it and even roll it out to my staff. The magic of the App Store allows you to do company wide roll-outs quite easily.
I'm not sure if an app like Shortcat can be released on the App Store given it uses the Accessibility APIs (sandboxing etc), also the 15-30% cut they take is a bit ooooof, but I do have plans to support company/teams licensing!
+1 this is awesome! I'd like to donate if I can :)
edit: nvm me, found the option in settings (on activation show shortcuts immediately).
Quick question: I've been playing around with Shortcat for a while. When I press the activation hotkey, it takes about 4 seconds for the yellow two-letter highlights to show up, despite the app's text stating "found n elements in ~0.20s". Is there a config option to show the yellow highlights instantly?
Thanks! I don't have a way to take tips yet, but you can support by pushing for developers to improve their accessibility implementations when you run into issues!
I see you found the setting for that. It was a deliberate default initially as the intended way to use Shortcat is to activate Shortcat and type what you want without waiting to see hints, as this is generally faster and less mental overhead IMO, especially for fast typists and well-structured interfaces.
However, some people prefer minimal keystrokes and I get that. I'm trying to figure out the right set of defaults to make it friendly to new users while nudging people to how Shortcat is designed to be used and will be tweaking it as I go.
Oh my! I came to the comment section to ask about a Mac app that I've seen a long time ago that did this. Lo and behold, you, the author, have written the first comment. :-)
Thank you for Shortcat, I used it a long time ago and loved it. Excited to give it another go!
Shortcat is utterly amazing. I really hope I can work this into my entire macOS usage. You should be really proud of what you've made because this is fantastic!!
I think this paradigm along with more app developers putting all the important functions in menus is a strong contender for Maximum Intuitive Productivity
I have plans to use ML/OCR to augment results down the road but the AX APIs and ecosystem on most apps (that I encounter, at least) are generally decent. Also, OCR means it won’t understand buttons with just icons, whereas AX APIs can grab em just fine.
Thanks! It’s easily my longest running project at a decade
I agree that Wayland by design can be more secure - but judging from my personal threat model, if malicious code gets to attack X or Wayland, it's already all over.
Writing events is certainly a potential security problem.
I know in the Windows world, one of the UAC features was that a less privileged process can't send events to an elevated window.
In X11, I think last I checked most distros disable the XTEST extension by default out of security concerns. Skimming the warpd code, they are using XTEST for the X backend.
As I think of the keylogger problem, it's not really privilege escalation, is it? If you're running as the same user as all the other clients, you could ptrace(2) them and intercept their event loops. I guess there are some container-based app deployment solutions now where you could run stuff at different security levels, so maybe it's more of a legit issue now...
Then you'd need to implement it in every compositor.
Excuse me for being blunt. I don't know if you understand how shitty of a design you advocate. Solid designs do not require modifying core components to write application level features the original authors did not envision.
Excuse me for being extra blunt. I don't know if you understand how shitty of a design you advocate. Solid designs do not open users to being attacked and their credentials stolen by malicious applications, including sandboxed ones.
Moving cursor around is a compositor's domain, not some arbitrary application's that decided to fiddle with the user's input.
To you it's an arbitrary program. To the user it's a program they want to work.
An API should not be so preachy about which programs can theoretically be written. It should provide broad mechanisms.
It is very frustrating to work with people who think like you do, that 3 or 4 unrelated projects have to carve up narrow exceptions to how the platform works for every single use case, nominally because of theoretical harm of this exploit no one will write, but actually more based on your ego perception that you know better than every other developer on the planet.
So Wayland has this long list of impossible applications which are doable everywhere else. It's a prima donna.
If there is a distinction, and there might be, "completely separate" is exaggerating.
Edit:
I see that only wlroots-based implementations are supported so far, and of those there are some things broken in Wayfire. Perhaps this is what you're referring to?
The Wayland implementation does not support GNOME Shell on Wayland (arguably the most mainstream combo since it's the default in Ubuntu). To support GNOME Shell on Wayland, you need to include a GNOME Shell extension, which has some headaches.
X is a third implementation.
And then there is one additional implementation per Wayland compositor that is not based on wlroots
Not a full and completely separate implementation though. Just wrappers to mask differences in compositor extensions that haven't been stabilized yet. If there are none the implementation is simply impossible.
Personally I don't think Wayland will get to a good spot until one compositor "wins" and provides interfaces for modification/extension such as e.g. X-like ability to write window managers without replacing the whole compositor. At this point I see no compelling reason to switch to Wayland for my own part.
Have you actually used Wayland? Besides the obvious current issues (compatibility and/or missing features), it is way better. Even something as simple as moving and resizing windows feels noticeably faster and more responsive on Wayland compared to X.
I don't use it at the moment because of compatibility issues with some software that I use, but that's not Wayland's fault.
> Besides the obvious current issues (compatibility and/or missing features)
I mean, that's the biggest criticism of Wayland: 15 years in and it still doesn't work (well?) with the biggest GPU brand. The criticism has always been about whether it is worth it to fragment Linux land and throw away a few decades' worth of work on X.
My criticism of Wayland has always been that they dropped a spec, forgot to make an actual server, and now every window manager needs to implement support for various extensions separately. It's a fine design if you assume everyone only uses Gnome, but...
It's not just throwing away a decade's worth of work on X, it's making everyone redo that work per display manager (thankfully wlroots exists as a kind of Wayland shared library).
Looks very useful! I especially like the ‘grid mode’; I would never have thought of that idea myself. It’s just a pity it isn’t available on Windows, though I’ve previously had good experiences with Mouseable [https://github.com/wirekang/mouseable].
When I first tried warp on my new keyboard, I really did not understand what was happening and thought it was bugged. The bindings came by default. I removed them and went on.
A few days later I wanted to add custom macros and had to read the docs. Skimming the warp section made me realize it's basically just recursive space positioning. I tried the mode again, and soon realized how incredibly useful this is.
It's a bit difficult for me to keep the state in my mind when going down the tree because there's no visual indicators. It's a keyboard firmware after all.
It's like AutoHotKey but designed to be more programming- rather than scripting-oriented. It is astonishingly easy to create GUIs in AutoIt in comparison, I use it to rapidly prototype UX ideas.
In fact, historically AHK is actually a fork of AutoIt.
Long time warpd user here: You should ditch grid mode for the much more efficient hint mode. Also, check out keyd by the same author. The combination of warpd/keyd easily saves me an hour of work every day.
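For readers who haven't used a hint mode: the general idea is to give every target on screen a short keyboard label and jump to a target when its label is typed. The sketch below shows one way such labels could be generated. It is not warpd's actual implementation; the function name `hint_labels` and the home-row alphabet are assumptions chosen for illustration.

```python
from itertools import islice, product
from typing import List


def hint_labels(count: int, alphabet: str = "asdfghjkl") -> List[str]:
    """Return `count` unique labels, all of the same length.

    Equal-length labels mean no label is a prefix of another, so a
    target can be selected as soon as its last character is typed.
    """
    if len(set(alphabet)) != len(alphabet) or len(alphabet) < 2:
        raise ValueError("alphabet needs at least two distinct characters")
    if count <= 0:
        return []
    # Find the shortest label length that yields enough combinations.
    length = 1
    while len(alphabet) ** length < count:
        length += 1
    combos = product(alphabet, repeat=length)
    return ["".join(chars) for chars in islice(combos, count)]


if __name__ == "__main__":
    print(hint_labels(12))
```

With the nine home-row keys assumed here, two keystrokes are enough to distinguish up to 81 targets, which is why this style can need fewer keypresses than repeatedly subdividing a grid.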
vim-easymotion for the entire screen. I love it. Useful in numerous ways from accessibility to constrained devices to keyboard-centric navigation.
In the 1980s, there was a thread of animosity directed at GUIs and mice: they were seen as productivity-killers that gave novices accessibility while robbing power users of expressivity and automation, as features shifted from text-mode applications towards UIs. I think we can agree that, with sufficient software engineering and UX work, CLI-UI-API parity is achievable. That would offer an easier learning curve and accommodate varying levels of user astuteness, mental models, and expressivity, by providing different MVC "views" or "presentations" for interacting with software or systems of any sort.
In Windows Speech Recognition or Dragon NaturallySpeaking, there is the mouse grid functionality. It divides the screen into a grid of nine tiles, then you type a number to select one of the tiles, then that tile gets divided into nine tiles, and so on recursively until you have a single coordinate selected.
I just wish I had an easy way to do that from the numpad. That way, to move the mouse to an arbitrary location I need it to be, I could type 19432 enter and know that corresponded to the coordinates to refresh the page I am reading, that way I could use the mouse less and less as I started to memorize the 80% case of where I need the mouse to go and just bang it out on the keyboard.
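The recursive nine-tile subdivision described above is easy to sketch. The Python below maps a digit string such as "19432" to a screen coordinate. It assumes tiles are numbered 1 to 9 in reading order (1 is top-left, 9 is bottom-right); a numpad-style layout with 7 at the top-left would only change the digit-to-tile mapping. The function name `grid_to_point` is made up for illustration, and the sketch only computes the point; actually moving the pointer would need a platform-specific API.

```python
from typing import Tuple


def grid_to_point(digits: str, width: int, height: int) -> Tuple[float, float]:
    """Map a sequence of tile numbers to the centre of the final tile.

    Each digit picks one tile of a 3x3 grid laid over the current
    region, and that tile becomes the region for the next digit.
    """
    left, top = 0.0, 0.0
    w, h = float(width), float(height)
    for d in digits:
        n = int(d)  # raises ValueError for non-digit characters
        if not 1 <= n <= 9:
            raise ValueError(f"tile numbers must be 1-9, got {d!r}")
        row, col = divmod(n - 1, 3)  # assumed reading-order numbering
        w, h = w / 3, h / 3          # shrink the region to one tile
        left += col * w
        top += row * h
    return (left + w / 2, top + h / 2)


if __name__ == "__main__":
    print(grid_to_point("19432", 1920, 1080))
```

Each digit shrinks the region by a factor of three in both directions, so five digits on a 1920-pixel-wide screen narrow the target to a tile about 8 pixels wide.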
Oh, neat; I'm very attached to keynav for this use case, but this is more portable. I'll have to dig into the Wayland limitations and caveats, since I thought that this was literally impossible to implement usefully there. Maybe this is one less blocker for me being able to switch now.
I use a thinkpad-style keyboard and my mouse is on the homerow.
It feels like that is much more efficient than this, as you get the precision of the mouse without having to move your hand.
I don't understand why more people don't adopt it. Is it because it's so different from a normal mouse?
No it's not. It's pressure sensitive and about as fast or faster than a trackpad with movement. I've never had a problem with speed, even across multiple, large, monitors.
Is this not what you mean: https://www.youtube.com/watch?v=7H8o_-7bKIU? I really doubt it's as fast/accurate as a trackpad even if you master it. This tool looks to be as fast as a mouse if you master it in many situations. But you would need a direct comparison by skilled users for each to be sure. I just don't see how a mini stick will ever beat pressing two buttons.
It's definitely not as accurate as a trackpad or a mouse, but it's not too hard to get very close. The benefit of it is that it's right on the home row. You don't need to move your hands to use the mouse.
On my thinkpad, I use the trackpoint and trackpad equally.
My experience is that a mouse is most accurate, followed by a trackpoint, followed by a trackpad, but then again I rarely use trackpads. I inevitably move the mouse when I take my finger off or try to press buttons. Also if I leave it enabled, I inevitably teleport the mouse around the screen with my palms, "palm detection" or no.
Sadly every non-IBM/Lenovo trackpoint I've ever used is awful (although significantly improved by putting a Thinkpad cover bit on the joystick, if you're stuck with one).
That said, having played through the SC1 campaign with a trackpoint... even the best ones are not as good as a real mouse.
It’s faster, but less accurate than a trackpad, and certainly faster than keynav (and probably the thing in the post, which is a re-implementation of keynav).
Windows has this in parts, as KeyNavish, Fluent Search, Win-vind, Voice Finger, Windows accessibility's Voice Access, window managers, etc., and they still fall short.
Some people need tools like this as an assistive technology: think RSI, Parkinson's, and other issues that affect dexterity or elbow movement. Not so weird.
Thank you for explaining - as a consequence of illness and disability, I can understand the need.
But why someone would intentionally make things more difficult for themselves as a preference, I don't get. It would be like walking around on crutches when you have two perfectly healthy legs.
I like to keep both hands on the keyboard. Every mouse movement incurs the cost of reaching the right hand for the mouse, then moving the right hand back and re-finding my place on the keyboard. I don't like that constant back-and-forth movement. It breaks my flow and it can make my arm ache.
If implemented well (as [0] is), it can actually be much faster than using the mouse for certain tasks. For example, when browsing Google results, it’s a lot quicker to navigate to a result by pressing the first letter or two of its link text than dragging the mouse to click the link.
As a more common example, I only launch applications by opening a prompt (e.g. Spotlight on Mac) and typing the first couple letters of the program I’m starting. This is much faster than navigating using the mouse to the applications folder/menu/dock/taskbar etc. and clicking an icon.
I agree keyboard-based navigation is not faster for everything. Luckily, tools like this don’t prevent you from also using a mouse!
Reminds me of the times when every clickable element had an unde&rlined key and could be activated by alt-r. Then some designheads decided that it was not beautiful and killed it.
If taken to the logical conclusion, your question extends to "why do we have keyboard shortcuts when you can just mouse there?" Taken to the illogical conclusion: "Why even have a keyboard when you can just use a mouse?"
There are times when a mouse is good, there are times when I don't want to take my hands off the keyboard and mouse for something.
> intentionally make things more difficult for themselves as a preference
No-one would do that, that would be crazy. People intentionally make things easier for themselves as a preference, and different people find different things easy or hard.
Seems unlikely you are a serious software developer, software engineer, or sysadmin. It’s well known mouse use slows you down and causes ergonomic issues.