Jun 01, 2025

The next generation of automation

Here's a relatively un-flashy video, but stick with me because I think this is really exciting stuff.

[Video, 0:52]

What we just watched was me asking Claude for the Mac to look at my email inbox, find the emails with feedback on my iOS app, and create reminders to work on those in Apple Reminders. Given that it took about 45 seconds and required me to manually key in what I wanted, this is more of a proof-of-concept than something I can immediately use day to day, but I'm really excited about it.

To get this working, all I had to do was install Hypercontext on my Mac, give it access to my email and Reminders, and it was good to go. I believe what this app does is set up a local MCP server on your Mac which can then be used by any app that can work with MCP. In this video it’s Claude, but it could be any LLM (including local models) that works with tooling like this.

But like I said, this is very much a proof-of-concept demonstration of what the tech can do; it’s not what I expect everybody to do. What does the productized, mainstream version of this look like? Consider the filters feature in Gmail, a feature that’s constantly grinding on trillions of emails to sort them the way users have requested, applying labels, auto-archiving, and doing all sorts of other things (I wonder how much energy that’s using). Businesses often have the results of these rules kick off automations (such as when a specific tag is added to an email) which then do something else in their software stack. This automation is super powerful, but it’s also rules-based, which means you need to tell the software explicitly what to do, and you need all of these services to be able to talk to each other. Zapier, for example, has built a massive business around making this sort of thing easier on the web, and Apple’s Shortcuts has done good work making automations more accessible on personal devices.
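To make "rules-based" concrete, here is a minimal Python sketch of that kind of filter. It is not how Gmail or Zapier actually work; the `Email` and `Rule` types and the `apply_rules` function are hypothetical names invented for illustration. The point it shows is that every condition has to be spelled out ahead of time.

```python
from dataclasses import dataclass, field


@dataclass
class Email:
    sender: str
    subject: str
    labels: set = field(default_factory=set)
    archived: bool = False


@dataclass
class Rule:
    # Hypothetical rule shape: an explicit keyword to match and what to do on a match.
    subject_contains: str
    add_label: str
    archive: bool = False


def apply_rules(email: Email, rules: list) -> Email:
    """Apply every rule whose keyword appears in the subject (case-insensitive)."""
    for rule in rules:
        if rule.subject_contains.lower() in email.subject.lower():
            email.labels.add(rule.add_label)
            email.archived = email.archived or rule.archive
    return email


rules = [
    Rule(subject_contains="feedback", add_label="app-feedback"),
    Rule(subject_contains="newsletter", add_label="reading", archive=True),
]

# Matches, because the subject literally contains the keyword.
matched = apply_rules(Email("user@example.com", "Feedback on your iOS app"), rules)

# Clearly feedback to a human reader, but no rule names these words, so nothing happens.
missed = apply_rules(Email("user@example.com", "The share button crashes"), rules)
```

The second email is the limitation in miniature: the rule only fires on wording someone anticipated when writing it.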

What excites me about this proof-of-concept is the idea that I could write “rules” in plain language, those requests could be run on a regular interval, and new feedback emails would automagically turn into tasks wherever I need them. I used Reminders in this demo since Hypercontext doesn’t support just any app right now, but there are other apps like Sky that are working on that for the desktop, and Zapier has done a lot of that for the web as well. In my ideal version of this, those tasks would appear in Things or my ticketing system and be assigned to the appropriate team if I were a larger business.
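A rough Python sketch of that idea follows, with loud caveats: none of this is Hypercontext's, Claude's, or MCP's actual API. `run_once`, `matches_rule`, and `create_task` are hypothetical names, and the two `fake_` functions are stand-ins for a real model call and a real task app. It only shows the shape of the loop: the rule is a plain sentence, and each scheduled run handles only emails it has not seen before.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Email:
    id: str
    subject: str
    body: str


# The "rule" is just a sentence. In a real setup a model would interpret it.
RULE = "Emails containing feedback on my iOS app should become tasks."


def run_once(
    inbox: Iterable[Email],
    seen_ids: set,
    matches_rule: Callable[[str, Email], bool],
    create_task: Callable[[str], str],
) -> list:
    """One pass over the inbox; a scheduler would call this on a regular interval."""
    created = []
    for email in inbox:
        if email.id in seen_ids:
            continue  # already handled on an earlier run
        seen_ids.add(email.id)
        if matches_rule(RULE, email):
            created.append(create_task(f"Follow up: {email.subject}"))
    return created


# Stand-in for an LLM judgment (assumption: a real version would send the
# rule and the email text to a model and read back a yes/no answer).
def fake_matches_rule(rule: str, email: Email) -> bool:
    return "app" in email.body.lower()


# Stand-in for a task app such as Reminders: it just records titles in a list.
tasks = []


def fake_create_task(title: str) -> str:
    tasks.append(title)
    return title


inbox = [
    Email("1", "Crash on launch", "Your app crashes when I open it"),
    Email("2", "Invoice", "Your receipt is attached"),
]
seen = set()
first = run_once(inbox, seen, fake_matches_rule, fake_create_task)
second = run_once(inbox, seen, fake_matches_rule, fake_create_task)
```

Tracking `seen` is what keeps a repeated run from creating duplicate tasks, which matters once something like this runs unattended.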

You can start to spiral pretty quickly from here and see the potential future where these tools are strung together to do even more. Why stop at making tasks when feedback is received? If it’s a bug, why not have another sequence pass the request into Cursor to make a new branch in the code base, try to fix the issue reported, and alert me when the branch is ready in GitHub for me to review and test? Why not have it try to build out new feature requests as well? A big part of receiving feedback on software today is prioritizing fixes and features since you don’t have infinite time to develop everything. In a world like this where a user can request a feature, your LLM automation can build it out in the background, and you can test it whenever you want to see if it’s actually useful, your sunk cost on trying new things suddenly drops significantly. Of course, these tools are far from perfect and this all sounds better in theory than it works in practice today, so I’m not saying this is all going to happen imminently (especially in massive code bases where LLMs are nowhere near capable of making smart engineering and product decisions today), but I can see why some people are really losing their minds over the possibilities here.

Federico Viticci recently said this when talking about Sky:

The most important thing to understand about Sky is that, thanks to LLMs, you don’t need to be a power user of the app to make your interactions with macOS faster. The Software Applications Incorporated team seems to have learned from their experience with Workflow and Shortcuts. There is virtually no learning curve to Sky; you can just start typing to get things done.

I often find myself extolling the virtues of technology's "democratizing" power. Things that were hard yesterday can be done more easily today, and that's almost always a good thing. Like all new technology, LLMs come with a lot of baggage (more than most, I'll fully admit), but stuff like this gets me really excited about the possibilities here. This isn't about generating slop, it's about taking things that used to be hard if not impossible for most people and turning them into something they can just ask their computer to do.