DALL-E 2 is awesome! I love it!

Warning: Lots of images below!

Earlier this week, I was finally invited to OpenAI’s DALL-E 2 public beta! And I’m completely in love with it. Below are some of my favorite pieces I’ve generated with it by giving it simple text prompts.

First, a few details about the app: Generating these pictures is a computationally intensive process, so they limit how many pictures you can generate. This is done with credits. Upon receiving an invite, they give you 50 free credits to start with. Each credit allows you to send one text prompt, and you get four variations in return. Each month they give you 15 more free credits. However, you can buy credits as well. Currently that price is $15 for 115 credits, which comes to a little over $0.13 per prompt, which really doesn’t sound bad, but it adds up quickly when you get addicted! Still, personally I think it’s totally worth it. Just wish I had more money to spend on it!
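For anyone who wants to check the math on those credit prices, here's a rough sketch in Python. The constant and function names are made up for illustration (they aren't part of any OpenAI tool), and the numbers are just the ones quoted above, which may well change.

```python
# Figures quoted in the post; all names here are illustrative only.
PACK_PRICE_USD = 15.00   # price of one paid credit pack
PACK_CREDITS = 115       # credits in that pack (one credit = one prompt)
IMAGES_PER_PROMPT = 4    # each prompt returns four variations


def cost_per_prompt(price=PACK_PRICE_USD, credits=PACK_CREDITS):
    """Dollars spent per text prompt when buying credits."""
    return price / credits


def cost_per_image(price=PACK_PRICE_USD, credits=PACK_CREDITS,
                   images=IMAGES_PER_PROMPT):
    """Dollars spent per generated image."""
    return cost_per_prompt(price, credits) / images


if __name__ == "__main__":
    print(f"per prompt: ${cost_per_prompt():.4f}")  # per prompt: $0.1304
    print(f"per image:  ${cost_per_image():.4f}")   # per image:  $0.0326
```

So each individual picture works out to a bit over three cents, assuming all four variations are counted.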

Sometimes you get really awesome results, sometimes you get weird abstract nonsense that’s nothing like what you had in mind. So you have to get a feel for what sort of prompts might give you something interesting, and what sort of prompts it won’t understand.

So here’s a little gallery of some of the stuff I’ve created so far. I’ve already spent $30 and it’s only my first week with access, so I will have to restrain myself now. (I still have around 85 credits left.)

Finally, it generates images at a resolution of 1024×1024. I’ve resized the images below in an effort to conserve screen space and bandwidth.

Dolphin eating a cheeseburger

This is similar to a prompt I tried on another AI image generator last year, so I was curious to see how DALL-E would do with the prompt. Much better!

Libraries

My favorite “style” of DALL-E’s output tends to be “oil painting”.

Steampunk owls

Animals wearing headphones

DALL-E tends to draw animals much better than humans, I suppose because they can be a bit more abstract and less structured than a human’s face. (Although note it doesn’t understand that headphones should go on mammals’ ears rather than the sides of their heads, haha.)

Some abstract art

The prompt here was something like “A painting of a giant eye ball sitting in a chair by the fire.”

Portrait of Mozart as various animals

Owls reading books

Painting of Ha Long Bay in Vietnam in the style of Van Gogh

Castles on cliffsides

Starry skies above castles

Flowers growing out of skulls

Money and treasure!

Pirate treasure maps

Skulls on fire

Weaknesses

The above are all cherry-picked examples of some of my favorite outputs so far; some results come out a lot less interesting. DALL-E is not very good with images that require specific structural detail, such as human faces, pianos, or even dragons. It excels at looser, less-structured forms, such as flowers, trees, and clouds. Below are some examples of output that I was less pleased with, showing some of its weaknesses.

Conclusion

Overall, despite its weaknesses, I’m still completely blown away by the quality of DALL-E’s output. I can’t wait to put some of the images I’ve generated to use as album covers or something! I love it!

TuneSage progress update 7

The time is flying by too quickly! But I am making progress. The backend melody-generating code is working much better now, though it’s actually only writing four-bar phrases at the moment. So I’ll be working on expanding that capability for the rest of the week, as well as expanding its stylistic palette with more training data. (A melody is just a collection of related phrases, so the foundation is already there.) If I’m lucky, I may even be able to share some example output next week.

Frontend-wise, I think the only other feature I need to work on for now is the ability to add, move, and delete tempos, which should only take a couple of hours. The frontend still needs a design overhaul, though, which will take another day or two.

The frontend will be missing a lot of features on launch, but users should at least be able to generate tunes and export them as MIDI files to open in their favorite DAW or notation program or whatever.

So I think my schedule is close to the same as it was in my last progress update:

  • This week: Finish backend and redesign frontend
  • Next week: Soundfont and user account system, start releasing samples
  • Week 3: Register company, install payment and analytics systems
  • Week 4: Set up trial, front page update, and launch!

If I can actually accomplish that, I could launch as soon as August 15!

But of course that’s probably not going to happen…

Still, we’re getting closer and closer!

TuneSage progress update 6

My goal last week was to “finish backend and overhaul frontend”, which definitely did not happen.

My work on the backend unfortunately came to another dead end. I was trying to automate the training of the AI so that I could just give it melodies and it would train on them with little oversight. It worked, but too inefficiently; it must be continually tweaked to work well, so the whole endeavor ends up taking even more time. For now, it seems it will be more efficient time-wise to train it manually. In other words, supervising the training by hand looks more time-efficient than leaving it to run unsupervised. (Which is perhaps an obvious outcome, but it was worth a try.)

So I’m already weeks behind! Schedule now looks like this:

  • This week and next week: Finish backend and overhaul frontend
  • Week 3: Soundfont and user account system, start releasing samples
  • Week 4: Register company, install payment and analytics systems
  • Week 5: Set up trial, stress testing, front page update, and launch!

Admittedly, it will still likely take longer than that…

Questions about startup idea:

The last startup school webinar was about evaluating your idea for a startup, and included a number of good questions to ask yourself about an idea to help with that. Granted, I’ve already chosen an idea (AI music SaaS), but I thought it still might be interesting to answer the questions:

  1. Does your team have founder / market fit to work on this idea? I don’t really have a team, but yes, as a programmer and a music composer, I think I have good founder / market fit. The product is something I want for myself and would use.
  2. How big is the market for this idea today? I don’t know. There are other AI music services out there, but I don’t know how good their profits are. The market for music software in general, however, is huge.
  3. How big could it be in a few years? Again, I don’t know, but I haven’t seen any indication that it’s growing rapidly at the moment.
  4. What is the problem you hope this product will solve? Have you seen this problem first hand? How confident are you that it’s actually a problem? For your users, how acute and frequent is the problem? Composing music can be time consuming; there are lots of creative decisions to make. It can also be difficult to get going if you’re just getting started or haven’t done it in a while. Yes, I have experienced this first hand. I know others have this problem as well, as my previous melody generating apps attracted some users. As for how frequent and acute the problem is, I don’t know; I’ll have to talk to more users.
  5. Do you have entrenched competition? If so, how will you beat them? Yes, there are a few competitors in AI music. I need to do some more research on them, but in my opinion, they’re of limited use, particularly because they do not generate interesting melodies. Their output tends to sound either too random or too bland. My focus on melody may be a good starting point to beat them.
  6. Is this something you personally want and would use? Definitely!
  7. Did this idea only recently become possible, or only recently become necessary? Yes and no. The algorithms would certainly be possible to run on older computers, but they take time to come up with.
    1. If not, why has no one solved it before? The algorithms are not obvious, I suppose, even with popular modern AI paradigms.
  8. What are the proxies – large, successful companies that do something similar to this? I don’t know if there are any. None that I know of, anyway.
  9. Is this a problem that you personally care about? Is it something that you would be willing to work on for a long time? Yes and yes.
  10. Can your solution scale? Could this be a consulting business in disguise? Since it’s an SaaS, yes, scaling is possible.
  11. Is this idea in a good “idea space”? I think? I’m not really sure what the “idea space” for AI music is.
  12. How did you come up with this idea? Did you start with the problem or the solution? Started with the problem. In fact, I don’t even have solutions to all the problems yet! But I think the solutions are reachable.
  13. Do you have a new insight about this idea, one that few others have? Yes, I think my approach to generating melodies is a new insight; I don’t see any other service offering decent melody generation at the moment.
  14. What are the current alternatives that people use instead of your product? Why will people switch to your product? How difficult will it be to get them to switch? I admittedly don’t know; I’ll have to talk to users.
  15. How will you make money? SaaS!
  16. If this is the kind of business that has a chicken-and-egg problem (i.e., a marketplace, a dating site), how will you solve it? No chicken-egg problem here!