What generative AI can and can’t do

Ravi Parikh
Co-founder & CEO
Apr 6, 2023
5 min read

It seems as if everyone is convinced that the AI-pocalypse is nigh, and that legions of jobs will be swallowed whole by the machine. After all, AI can conjure up convincing text, code, images, audio, and video; summarize things; explain things; and maybe even reason about things. Tons of jobs consist entirely of tasks like these, so it doesn’t seem entirely off-base to imagine that we’re only a couple more GPTs away from mass unemployment.

This hype is half-correct. I do believe that certain tasks are poised for rapid automation. But I'd like to propose that there are also jobs that seem perched on the brink of an AI revolution, yet will prove very resistant to full automation for a while.

The tasks that current-gen generative AI is good at share two properties: their outputs are verifiable and editable. If the person using AI to create something can easily verify the quality of the output and edit it, then the task is a good candidate for automation by AI. If either property is missing, AI can’t really help.

Note that this won’t stay true forever. When we see a doctor, we don’t expect to be able to verify that their diagnosis is correct, but we trust it anyway. Theoretical future AI systems might work the same way. But the current methods for training LLMs and other AIs lead to systems that “hallucinate,” or confidently present made-up facts, and it’s crucial that the person using these outputs is skilled enough to discard or fix them.

Let’s walk through a couple examples of how verifiability and editability manifest in practice.

Photos and illustrations

AI systems like Midjourney, Stable Diffusion, and DALL-E can conjure images from natural language prompts, leading to questions about what this means for visual artists, illustrators, and stock photographers. Unfortunately, I believe this threat is very real, and there will be massive disruption in a very short amount of time.

Why?

Images are verifiable: even if you have no art skill, it's easy to determine whether an image meets your expectations. Many non-artists can assess if an image is a good fit, even if they couldn't create it themselves.

Images are editable: the generated image might not be an exact match, but if it's 90% there, it's relatively simple to iterate on it. Some AI tools offer tuning and editing options or let you use other images as reference points. And if AI can't quite get you all the way, there's always Photoshop to finish the job. Editing an existing image demands far less skill and time than creating one from scratch.

So overall, I do think AI is coming for some artist jobs, unfortunately. Many artists will adapt and differentiate their work from AI, but non-professionals will soon be able to fully execute a portion of the work that currently goes to stock photographers and some types of visual artists. I’ve already seen many instances of bloggers using AI-generated stock photos, indie musicians using AI-generated album art, and corporate execs using AI-generated headshots.

(Note that I don’t mean that all or even most artist jobs will get automated away tomorrow. There are tons of types of visual art or design that require a high degree of expertise to verify or edit.)

Legal work

GPT-4 can pass the bar exam. The internet is awash in legal templates and court transcripts for LLMs to scavenge as part of their training data. It seems inevitable that AI is coming for legal jobs.

However, I don’t think AI will automate away lawyers anytime soon.

Legal text generated by an LLM is hard to verify. Recently, I asked my startup’s law firm to create a customized data protection agreement for our business. If I had asked an LLM to do the same thing instead, I would have had no way to verify whether the produced document was correct. For the same reason, it’s hard to edit: I’d have no idea where to start.

An actual lawyer may be able to use an LLM to accelerate their work, since LLM-generated legal text is something they can verify and edit.

What’s the difference between legal text and images? Legal text is something that only a lawyer can verify and edit, but images are verifiable and (to an extent) editable by anyone.

Code generation

GitHub Copilot churns out code. But it does so in an IDE, aimed squarely at software engineers.

When an engineer uses Copilot, the code is indeed verifiable and editable. Copilot's success is therefore hardly surprising.

Non-engineers attempting to use LLMs to craft software, however, will likely find their efforts neither verifiable nor editable. Let’s say you ask an LLM to build an interactive web application for you. Can you verify that it works? Kind of: you can poke around the website, click buttons, fill out forms, and create accounts. But you won’t truly know whether it works. Maybe it seems to work as intended, but how do you know it’s storing passwords securely? How do you know there isn’t some bug or edge case you didn’t think of?
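To make the verifiability problem concrete, here’s a toy sketch (the function names and storage scheme are purely illustrative, not taken from any real tool) of two signup handlers that are indistinguishable from the browser:

```python
import hashlib
import os

# Toy in-memory "user database" for illustration.
users = {}

def create_account_insecure(username: str, password: str) -> None:
    # Stores the password in plaintext. The app still "works":
    # signup and login behave exactly as the user expects.
    users[username] = password

def create_account_secure(username: str, password: str) -> None:
    # Stores a salted PBKDF2 hash instead. From the browser, this
    # version is indistinguishable from the one above.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    users[username] = (salt, digest)
```

An engineer reading the code spots the plaintext version instantly; someone who can only test through the UI never will. That gap is exactly what makes LLM-generated software unverifiable for non-engineers.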

As with lawyers and legal AI, code AIs will make engineers faster but won’t automate them away completely.

What this means for AI-driven tools

So what does this mean for the near future? There are two scenarios:

  • Disruption: In some cases, like stock photography, AI will change where professionals exist in the ecosystem. Instead of stock photographers or illustrators working directly with clients, they’ll sell directly to the AI model companies, which in turn will provide client-facing tools. Companies like Adobe are reportedly working on a commercial model that looks like this.
  • Leverage: In other cases, like software engineering or law, AI will provide leverage, instead of replacing the function entirely. Your expertise is still required to verify and edit AI output.

This also means that the AI-enabled tools themselves need to be tailored to how their outputs will get used. Stable Diffusion and Midjourney have become so popular precisely because their outputs can be verified and edited by non-experts. Midjourney is literally only usable as a Discord bot; this would be a horrible form factor for a professional artist, but it works well for non-expert users.

On the other hand, code AIs need to be aimed at software developers. Appropriately, GitHub Copilot is built into a developer IDE rather than being delivered as a no-code black box. I bet we’ll see a lot of AI-enabled no-code tools designed to help non-developers build software. But they’ll fall short of expectations because the outputs are delivered in a format that is inherently hard to verify and edit compared to code.
