Best AI Text-to-Image Generators: What Designers Think

Kate Kalevich
Irina Cherechecha

How do designers feel about the rising popularity of models like Stable Diffusion, Midjourney, and DALL-E? We asked our designers. Spoiler: the models are quite helpful.

Tensorway is a team of people passionate about what they do. However, the latest developments force many of us to question whether human involvement really is irreplaceable in some fields, given what neural networks are now capable of. Designers feel this most directly. Just a few months after the text-to-image models Stable Diffusion, Midjourney, and DALL-E 2 launched, thousands of articles like “Will AI replace designers?” popped up. We’re not here to answer that; instead, we invited our designers Kate and Irina to speak up and share their thoughts on whether these models are out to take their jobs or are actually helpful.

Read on if you want to learn:

  • which model — they’ve tried quite a few — is best and for what,
  • how AI assists in UX/UI designers’ day-to-day tasks,
  • and whether designers (at least two of them!) are feeling threatened by AI.

Before we start, any UX/UI designer will tell you how important it is to distinguish between UX and UI design; you will see how this distinction applies to the topic below. We hate being Captain Obvious, but we have to repeat it here as well: UI design is creating the visual layer of a software product, whereas UX design is thinking through how the user completes their target actions by interacting with that product.

That’s it, let’s go!

Is this explanation correct?

Irina: Beyond perfect=)

So you’ve tried all three models — Midjourney, Stable Diffusion, and DALL-E. What were your first impressions?

Kate: Of course, it was “wow!” when I first looked at the example images. And of course, I thought I would also get amazing results as easy as one, two, three. Spoiler: I did not. It was not that easy.

Irina: But definitely worth digging into.

Of all the AI image generators you’ve tried, which are the best, and for what tasks?

Irina: The models differ a lot in the images they produce because they were trained differently, so the results are not the same in terms of style. The clearest example is Midjourney, which creates ethereal, artwork-video-game-styled images; that’s because it was trained on artwork. DALL-E, on the other hand, is believed to be best at generating realistic, life-like images. Stable Diffusion, in my opinion, is a mixture of both but leans toward the realistic side.

Kate: For me, Stable Diffusion turned out to be the hardest to find an approach to. Maybe it’s just me, but prompting has to be way more in-depth than with the other models. Like, I had real problems trying to generate a cat with green eyes that would look like a cat, not some diabolic creature.

I played around with all three models, and for me, Midjourney stands out tremendously. From what I’ve noticed, it effortlessly fills in all the gaps and background elements with a beautiful mist of small details. Silhouettes, lights, or a dark forest behind the central characters create that mystic fantasy atmosphere. That said, Midjourney is certainly not meant for minimalist illustrations. You won’t get anything flat, simple, or 2D from it. I tried many times and came close, but it took me very long to get something like this.

“happy smiling girl with a phone, flat simple illustration, pastel colors” — Midjourney

Has it ever happened that an AI text-to-image model just couldn’t deliver a satisfactory result? What did you do then?

Kate: You hardly ever get the result you need on the first try. Midjourney only offers 25 free tries, and they go faster than you’d expect! Prompting is the thing you need to master. If the model doesn’t work for you while you see everyone else succeeding, it’s probably you, sorry. Try as many variations of your prompt as you can, or ask someone else to formulate the request for you; different people focus on completely different aspects of a thing when describing it. By the way, that somehow applies to AI too.

Irina: Why not ask ChatGPT to paraphrase the prompt? Just kidding. Or not.

Kate: Sometimes you don’t know yourself what you want to see in the final result. I seriously doubt everyone else has a clear idea of the image composition and the placement of every glare of light before they type the prompt.

Have these models affected your day-to-day job, and how?

Irina: At first, it was all new, but now that we’ve been using Midjourney for a while, we’re getting used to it. Fundamentally, our job hasn’t changed much. For years in UX/UI design, I’ve been using stock images that I needed to find, select, and download before putting them into interfaces. The biggest change is the image source: it’s no longer stock websites but text-to-image models. They are also more cost-effective business-wise, as a Midjourney subscription is cheaper than buying images from artists. Sometimes it takes longer to get to the desired output than to find an existing image that would fit; sometimes you get something close enough within a couple of tries. It depends.

Kate: Don’t forget that often enough, the output needs tuning. Different models behave differently if you ask them to add this and that to the bottom left corner: Midjourney will regenerate the entire image, while Stable Diffusion and DALL-E are more responsive to such changes, more flexible. But sometimes it’s genuinely easier to fire up Photoshop and add a couple of strokes yourself. Not to mention the gibberish text and even watermarks you can get in images. Yes, sometimes that happens too!

Irina: Yeah, about tuning. Our ad banners for Tensorway were created with Midjourney. Midjourney’s settings are quite limited, especially compared to Stable Diffusion, where you can set any image size and even give negative prompts, which basically means writing down anything you don’t want to see in the image. But we love Midjourney for the extreme detail and vivid style of the images you get. In that respect, I personally can’t complain.
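For anyone who wants to try negative prompts themselves, here is a minimal sketch using the open-source Hugging Face diffusers library. The library choice, model ID, prompts, and settings are our illustrative assumptions, not the exact setup the team used.

```python
# A minimal sketch of negative prompting with Stable Diffusion via diffusers.
# Model ID, prompts, and parameters are illustrative examples.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="person speaking with robot, cute illustration, spectacular highlights",
    # Everything you don't want to see in the image goes here:
    negative_prompt="text, watermark, extra limbs, blurry, distorted proportions",
    width=512,
    height=768,              # Stable Diffusion lets you pick the image size freely
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]

image.save("banner_draft.png")
```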

However, design is based on more than pretty colors and composition; it’s also about meeting the technical requirements. Banners rarely have a 1:1 aspect ratio, but neither do they have the 2:3 that Midjourney offers. These are the results of asking the model to change the ratio to 2:3.

“person speaking with robot, cute illustration, 8k, spectacular highlights, hdr” — Midjourney
Same prompt + “--ar 2:3” — Midjourney

The images look rather flattened, whereas I needed them banner-wide, with the characters’ proportions intact and with more free space on the sides. So I had to change the aspect ratio myself and somehow fill the space that appeared on each side.

DALL-E and Stable Diffusion offer much more when it comes to customizing already generated images. For example, not only can you easily manipulate the size, but you can also give prompts specifically for the added areas. If some element in a generated image is not to your liking, simply erase it, and the model will replace the erased area with something else. Stable Diffusion goes a step further: it lets you upload a pre-existing image and modify it.
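As an illustration of that “modify an existing image” workflow, here is a minimal sketch with the diffusers image-to-image pipeline. The file names, prompt, and strength value are hypothetical, and the library choice is our assumption for illustration.

```python
# A minimal sketch of reworking a pre-existing image with Stable Diffusion
# (img2img) via diffusers. File names, prompt, and strength are hypothetical.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# The starting image, e.g. a banner generated elsewhere, resized to a wide format.
init_image = Image.open("midjourney_banner.png").convert("RGB").resize((768, 512))

result = pipe(
    prompt="person speaking with robot, cute illustration, wide banner, free space on the sides",
    image=init_image,
    strength=0.45,        # lower values stay closer to the original image
    guidance_scale=7.5,
).images[0]

result.save("banner_reworked.png")
```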

Kate: For me, these models are a source of inspiration when I can’t quite figure out how, for example, an app screen should look. They really help establish the composition and essentially show how other designers would solve the same problem, since the model is trained on the work of millions of designers! The other day, I was working on onboarding screens for a wellness app. They had to be engaging yet not annoying, you know. And here’s what I got after prompting something like “nature green application onboarding screen.” It’s like searching for inspiration on Pinterest but getting even more personalized results.

"onboarding screen for a wellness app" — Midjourney

Speaking of practical applications, when working on Anadea’s projects or for Tensorway, we mostly play around with illustrations for landing pages and ad banners. It’s a good opportunity to use something unique and tailored to our requirements. However, you can still recognize AI in many of those images, and I can already see it displacing the Corporate Memphis flat illustrations we’re all tired of.

Corporate Memphis style / ‘Big tech’ illustration style — Google

It’s pretty obvious that AI can be helpful in creating app interfaces. But what about UX? Can AI assist in creating more streamlined experiences?

Irina: The term “UX mining” was introduced not so long ago; it was coined by the company of the same name. They say it’s like traditional data mining but for user experience research. That, I believe, is a topic for another discussion, since we’re talking about Midjourney and co. here=)

Do you feel like your occupation as a designer is at risk because of AI? 

Irina: Right now, where we are, not really. In the end, I’m not an illustrator. Those folks are probably going through some rough times, but you’d better ask them! All three models, depending on the desired style, are great assistants to designers, but they can’t design yet. However, the models are learning really fast, and I can see them getting better at creating logos every day. One day I may take that back.

Kate: When the first cameras appeared, there was much buzz about painters becoming unnecessary. But a hundred years later, people still go to museums and admire art. Image generation models don’t substitute for people’s work. They complement it instead, giving room for growth.

TL;DR

  • All the models are worth trying. Stick to the one whose style you like most. Midjourney works best for artistic illustration, while Stable Diffusion and DALL-E lean toward the photorealistic side.
  • Of the three, Stable Diffusion offers the most customization settings-wise. Its prompts must be as detailed as possible.
  • Midjourney is the easiest model to work with. You get vividly styled images and well-worked-out backgrounds by default.
  • In most cases, generated images require tuning if you intend to use them in commercial design. Stable Diffusion and DALL-E allow adding and removing elements in an image, while Midjourney will redo the whole image to accommodate the new detail.
  • The discussed use cases apply to user interface (UI) design, because the only thing text-to-image models can help with is image generation.
  • In 2023, interface design still needs human involvement. Models mainly serve as inspiration and a source of unique illustrations.

Looking for an AI integration partner?

Get Started with Us