Mixing my own music with AI

Suno AI has an “upload audio” feature, allowing users to upload up to 60 seconds of their own content for the AI to extend. So earlier this month I had some fun feeding it 45- to 60-second clips of my own music and having the AI write lyrics and turn the clips into choir songs. It’s interesting to hear how the AI incorporates the melodies, chord progressions, and orchestrations I provide into its own creations. The lyrics are a bit amateurish, but serviceable; I was too lazy to write my own. I’m calling the project Hannifin x AI. Here’s the first installment, based on my classic piece “Hour by Hour”; the first 60 seconds are from the original piece, while the rest is AI-generated.

I did the same with 18 of my other pieces. Some things I noticed:

  • The AI works best with simple 8-bar melodies, or 4-bar phrases. It doesn’t seem to “parse” weirder phrase structures very well.
  • It’s not very good at extending the input instrumentally, in my opinion; it quickly starts to sound too repetitive. Having it produce lyrics and turn the music into a song seems to work better. (Melodic repetition seems easier to bear with alternating lyrics.)
  • If you want the AI to generate the voice of a choir, feeding it music from the louder, more energetic and melodic parts of a piece seems to work better, especially if it features a prominent string section. Otherwise you’re more likely to get a soloist, and the music it generates is less likely to sound like a natural continuation of the music you provide.
  • For whatever reason, some tracks just seem to work better than others; maybe it depends on how “familiar” the AI is with the melodic and harmonic patterns? Some tracks gave me pleasant results right away; for others, I had to roll the dice over and over to get something acceptable.

There were some pieces I tried for which I could not get any output I was happy with, including The King’s Assassin, The Moon Dreamed By, and On the Edge of a Dream. On the other hand, one track, Silver Moon Waltz, yielded a couple of songs I was pleased with. Anyway, I’m done trying for now.

As for the video above, I made it with Blender 4.2, which took a little time to figure out, mostly with the help of various YouTube tutorials. I’m not completely satisfied with the results. What’s supposed to be “dust” looks perhaps too much like snow and moves a bit too fast, and the text looks a bit weird. It turns out that creating a true “drop shadow” effect on text in Blender is pretty much impossible; I had to fake it with compositing cheats, and I’m not sure I did the best job. (I could’ve just put the text on the background picture and used an image editor to create the drop shadow, but I wanted the animated frequency bars to have it too.) Also, the text might be a bit too bright, but I blame that on the VR display I get with Immersed on the Meta Quest 3.
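
For the curious, here’s roughly what I mean by a compositing cheat: blur a copy of the text layer, darken it, offset it, and composite the sharp text back on top. This is a from-memory sketch against Blender 4.2’s Python API, not a recipe; the “Text” view layer name and all the numbers are my own assumptions.

```python
# Fake drop shadow in the compositor. Assumes the text renders on its own
# view layer named "Text" (my name, not a Blender default).
import bpy

scene = bpy.context.scene
scene.use_nodes = True
tree = scene.node_tree
tree.nodes.clear()

rl = tree.nodes.new("CompositorNodeRLayers")
rl.layer = "Text"  # assumed view layer holding only the text

blur = tree.nodes.new("CompositorNodeBlur")
blur.size_x = blur.size_y = 8  # shadow softness

darken = tree.nodes.new("CompositorNodeMixRGB")
darken.blend_type = 'MULTIPLY'
darken.inputs[0].default_value = 1.0                    # full-strength mix
darken.inputs[2].default_value = (0.0, 0.0, 0.0, 1.0)   # multiply toward black

shift = tree.nodes.new("CompositorNodeTranslate")
shift.inputs["X"].default_value = 6    # shadow offset in pixels
shift.inputs["Y"].default_value = -6

over = tree.nodes.new("CompositorNodeAlphaOver")
out = tree.nodes.new("CompositorNodeComposite")

tree.links.new(rl.outputs["Image"], blur.inputs["Image"])
tree.links.new(blur.outputs["Image"], darken.inputs[1])
tree.links.new(darken.outputs["Image"], shift.inputs["Image"])
tree.links.new(shift.outputs["Image"], over.inputs[1])  # shadow underneath
tree.links.new(rl.outputs["Image"], over.inputs[2])     # sharp text on top
tree.links.new(over.outputs["Image"], out.inputs["Image"])
# The result still needs to be alpha-overed onto the background picture.
```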

I’ll upload the other 19 songs I created soon!

DALL-E 2 is awesome! I love it!

Warning: Lots of images below!

Earlier this week, I was finally invited to OpenAI’s DALL-E 2 public beta! And I’m completely in love with it. Below are some of my favorite pieces I’ve generated with it by giving it simple text prompts.

First, a few details about the app: generating these pictures is computationally intensive, so they limit how many you can generate. This is done with credits. Upon receiving an invite, you get 50 free credits to start with. Each credit lets you send one text prompt, and you get four variations in return. Each month they give you 15 more free credits. However, you can buy credits as well. Currently the price is $15 for 115 credits, which comes to a little over $0.13 per prompt. That really doesn’t sound bad, but it adds up quickly when you get addicted! Still, personally I think it’s totally worth it. I just wish I had more money to spend on it!
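
If you want to sanity-check that math, here’s the back-of-the-envelope version (prices as of this writing):

```python
# Cost per prompt and per image at current DALL-E 2 pricing.
pack_price = 15.00     # dollars per credit pack
pack_credits = 115
images_per_credit = 4  # each prompt returns four variations

per_prompt = pack_price / pack_credits
per_image = per_prompt / images_per_credit
print(f"${per_prompt:.3f} per prompt, ${per_image:.3f} per image")
# -> $0.130 per prompt, $0.033 per image
```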

Sometimes you get really awesome results, sometimes you get weird abstract nonsense that’s nothing like what you had in mind. So you have to get a feel for what sort of prompts might give you something interesting, and what sort of prompts it won’t understand.

So here’s a little gallery of some of the stuff I’ve created so far. I’ve already spent $30 and it’s only my first week with access, so I will have to restrain myself now. (I still have around 85 credits left.)

Finally, it generates images at a resolution of 1024×1024. I’ve resized the images below to conserve screen space and bandwidth.
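
For what it’s worth, here’s the sort of one-off Pillow script I mean for the shrinking; the folder names and the 512×512 target are just my choices, nothing DALL-E imposes:

```python
# Batch-shrink 1024x1024 originals to half size for the blog.
from pathlib import Path
from PIL import Image

src, dst = Path("dalle_outputs"), Path("resized")  # hypothetical folders
dst.mkdir(exist_ok=True)

for path in src.glob("*.png"):
    img = Image.open(path)
    img.resize((512, 512), Image.LANCZOS).save(dst / path.name)
```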

Dolphin eating a cheeseburger

This is similar to a prompt I tried on another AI image generator last year, so I was curious to see how DALL-E would handle it. Much better!

Libraries

My favorite “style” of DALL-E’s output tends to be “oil painting”.

Steampunk owls

Animals wearing headphones

DALL-E tends to draw animals much better than humans, I suppose because their forms can be a bit more abstract and less structured than a human face. (Although note it doesn’t understand that headphones should go over mammals’ ears rather than on the sides of their heads, haha.)

Some abstract art

The prompt here was something like “A painting of a giant eyeball sitting in a chair by the fire.”

Portrait of Mozart as various animals

Owls reading books

Painting of Ha Long Bay in Vietnam in the style of Van Gogh

Castles on cliffsides

Starry skies above castles

Flowers growing out of skulls

Money and treasure!

Pirate treasure maps

Skulls on fire

Weaknesses

The above are all cherry-picked examples of some of my favorite outputs so far; some results come out a lot less interesting. DALL-E is particularly weak with images that require specific structural detail, such as human faces, pianos, or even dragons. It excels at looser, less-structured forms, such as flowers, trees, and clouds. Below are some examples of output I was less pleased with, showing some of its weaknesses.

Conclusion

Overall, despite its weaknesses, I’m still completely blown away by the quality of DALL-E’s output. I can’t wait to put some of the images I’ve generated to use as album covers or something! I love it!

AI generated images are getting better!

Last year I posted about creating AI art. The website I mentioned, NightCafe, is still around and has added interesting new features, but the images it generates still lean primarily to the abstract side. It doesn’t generate much that I’d consider of practical use beyond having fun.

But just a few weeks ago, OpenAI announced DALL-E 2, and the images it generates are much more mind-blowing and exciting. Here’s a brief overview of the tech from Two Minute Papers:

What a time to be alive!

Granted, the examples shown in the video and on OpenAI’s website are cherry-picked. There are other examples out there that look a bit more wonky. It still doesn’t seem to be great with human faces, for example, or things requiring a lot of finer detail, and it’s awful at generating text in images.

Here’s another video describing the tech:

Despite its weaknesses, it still looks enormously more useful, fun, and exciting than the AI image generators I looked at in that post from last year. I of course added my name to the waitlist. I’d love to experiment with it, though I probably won’t get access anytime soon. Still, DALL-E 2 definitely looks like something I’d be more than willing to pay for (assuming the price isn’t too high). I can at least imagine creating useful images to accompany blog posts, short stories, book or album covers, or something.

Amazing stuff!

ETA: Also check out this mind-blowing art book of 1,000 robot paintings by DALL-E 2 in various styles: 1111101000 Robots

Fun with AI generated art

Over the past week I’ve been having some fun generating bizarre digital art with AI via a couple of websites. You generate the art by simply giving the AI a text prompt, such as “castles in the sky” and, after a couple of minutes, out pops the AI’s somewhat abstract but interesting interpretation:

Castles in the Sky

Since the results are rather abstract, it helps to use words that lack specific forms, such as clouds and landscapes. If you ask for an animal or a human, you’re probably not going to get anything that actually resembles their shape, but rather only some abstract colors and textures reminiscent of them. For instance, here is “dolphins eating sandwiches”:

Dolphins Eating Sandwiches

It also helps to give the AI some hints as to what the result should look like. For instance, the exact prompt for “castles in the sky” above was actually: “An enormous castle floats in the sky beautiful artwork”. Adding the tags “beautiful artwork” helps give it a more painterly look.

The art is also limited in resolution; the AI simply takes too much memory at larger sizes, so smaller resolutions are the norm.

I’ve been using two websites to create such art:

  1. NightCafe Studio’s AI Art Generator. The site features a very nice user interface, lets you set some optional settings, and allows you to save and share your work while exploring the works of others. It does make you create an account and limits how much you can create with it using a credit system. You can buy credits or earn some. You can check out my profile here: https://creator.nightcafe.studio/u/Seanthebest
  2. NeuralBlender has a much more bare-bones interface with no options, but does not seem to limit use. You do have to wait for the AI to finish its current image before starting a new one if you want to see it in your browser.

I have not yet tried it, but if you do a bit of Googling, you can find resources on how to set up your own AI art generator without having to use one of the websites above; the tech is called VQGAN+CLIP and is available to all. A “GAN” is a generative adversarial network … and I have no idea what the other acronyms stand for (obviously you can Google that too). So I’m not sure how long the above websites will stay in service considering the tech is not proprietary, nor do I think the AI produces artwork of enough controllable quality to be of widespread use beyond offering an amusing spectacle.
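
From what I gather, the CLIP half scores how well an image matches a text prompt, and the generator gets nudged to raise that score. Here’s a toy sketch of just that guidance loop, optimizing raw pixels instead of VQGAN latents, so it’s illustrative rather than the real pipeline; it assumes PyTorch and OpenAI’s clip package (pip install git+https://github.com/openai/CLIP.git):

```python
# CLIP-guided image optimization: push an image toward a text prompt.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model = model.float()  # avoid fp16/fp32 mismatches on GPU

tokens = clip.tokenize(["an enormous castle floats in the sky beautiful artwork"]).to(device)
with torch.no_grad():
    text_features = model.encode_text(tokens)

# Start from noise at CLIP's native 224x224; memory is why bigger is hard.
image = torch.rand(1, 3, 224, 224, device=device, requires_grad=True)
optimizer = torch.optim.Adam([image], lr=0.05)

for step in range(300):
    optimizer.zero_grad()
    image_features = model.encode_image(image.clamp(0, 1))
    # Maximize cosine similarity between the image and text embeddings.
    loss = -torch.cosine_similarity(image_features, text_features).mean()
    loss.backward()
    optimizer.step()
```

The real VQGAN+CLIP setup optimizes the latent codes of a pretrained VQGAN decoder instead of raw pixels, which is what gives the outputs their painterly texture rather than adversarial noise.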

Still, it’s fun to play with. Here are some of my favorites that I’ve generated so far:

Colorful Clouds

Airship

The Sky Is Cracked

Blue Sky at Night

Stone Palace

Library

Library 2

Library 3

1,055 books…

… are in my personal library, yay!

I had been meaning to digitally catalog my book collection for some time now. On several occasions I’ve found books at used bookstores that I wasn’t sure whether I already owned (typically books in a series or books by prolific authors). So I finally used a free app called Libib to digitally catalog the books I own (not including eBooks at the moment; I only have perhaps a dozen of those). Next time I’m wandering the shelves of a used bookstore, I can search the app to be sure of what I have and what I don’t. Even while cataloging the books, I found a few to weed out because I owned multiple copies.

You can scroll through my library here: https://shannifin.libib.com/

(Unfortunately there does not yet seem to be a way to sort the public listing in any other way besides by title.)

I get the majority of my books used, and have walked away with some big hauls for cheap when stores were going out of business or getting rid of excess. I’m sure I still spend too much money on books considering my slow reading speed, but they’re addictive to collect, aren’t they?

I’ve only read around 10% of these books. Of course, some books are more for reference and not really meant to be read from front to back anyway. Still, with my current reading speed, I will likely die with the majority of these books left unread. Which is fine, because upon death I will have access to infinite knowledge… I hope.

Anyway, if you’re a book lover or collector and wish to digitize your catalog, Libib is the best free app (for Android) I’ve come across so far. It also allows you to export a CSV file, which is handy.
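
As an example of why that’s handy, here’s a little Python sketch of the kind of lookup I want at the bookstore: is this title already on my shelves? The column name “title” is my guess at Libib’s export format, so check the header row of your own export.

```python
# Check whether a title is already in a Libib CSV export.
# "title" as the column name is an assumption about the export format;
# adjust it to match the header row of your own CSV.
import csv

def owned_titles(csv_path):
    with open(csv_path, newline="", encoding="utf-8") as f:
        return {row["title"].strip().lower() for row in csv.DictReader(f)}

library = owned_titles("libib_export.csv")  # hypothetical filename
print("the name of the wind" in library)    # True if I already own it
```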

Patreon update and some random stuff

Haven’t blogged in a while, have I? I need to try blogging more frequently, as it’s at least a bit of writing practice while I’m busy plotting.

Quick Patreon update

I’m busy with some other projects, so my music composing (and YouTube video making) has fallen behind; I failed to deliver anything for March to my Patreon supporters. I’ve gone ahead and frozen donations for this month to give myself some time to catch up. I’m hoping I’ll be back at it next month, but I’ll have to wait and see. Regardless, those March pieces are still coming.

A decade of blogging!

I started this blog in April 2007, when I was a junior in college. It’s now been a decade! Yay! Woohoo! And what do I have to show for it? What have I accomplished in that time? Let’s not ask that question, and just consider being around for a decade an accomplishment in and of itself, OK? Yay!

FaceApp

An app called “FaceApp” was recently released for Android (I think it’s available for iPhone too), and I’ve had some great fun playing with it; I find it hilarious. I posted some results to my Twitter:

Hours of great fun!

Lyrebird

A new company called Lyrebird is developing some voice synthesis tools, and it sounds pretty awesome! Check out the samples on their demo page. The voices of Trump, Obama, and Hillary Clinton are still a bit too fuzzy to be used for anything other than playing around, but I’m still excited by the potential. I think it would be awesome to create an audio drama, for example, without having to hire a bunch of different voice actors. There’s also a lot of potential for this sort of technology in music instrument sampling, yes? Especially synthesized choirs. I look forward to seeing this product develop!

How “Bates Motel” should’ve ended

Spoilers ahead. The A&E drama series “Bates Motel”, a modern-day retelling of Hitchcock’s classic film Psycho, ended its five-season run this Monday. The ending left me rather disappointed; it felt too quick and easy. Just unsatisfying. So here’s how I would’ve ended it:

Norman does not kill Romero in the woods. Instead, he tries to kill him but only injures him, and Romero runs off with Norman shooting after him. Norman takes Norma’s body home imagining they’re starting over, and invites Dylan to dinner, as he did in the actual episode. Dylan calls Emma, but rather than just hanging up, Emma calls that town’s police, afraid for Dylan’s life. Dylan enters the home, sees Norma’s body, vomits in his mouth but swallows it (because I hate seeing characters vomit), and, rather than working Norman up into a crazy angry frenzy, actually manages to calm him, perhaps at least half-convincing him that he needs to be in a mental hospital, that Norma will always be with him there, or something; at the very least he keeps him calm. Romero enters, finds the gun, and points it at Norman, ready to kill. Arriving and hearing a commotion, the police break in, ordering Romero to drop the weapon. Romero refuses and shoots Norman before getting shot himself by the police. Norman dies in his brother’s arms and they have their sad little brotherly moment (without the stupid suicidal “thank you”; there’s nothing beautiful or bittersweet about suicide, and I think that’s what annoyed me the most). Slow zoom out on Dylan with dead Norman and Norma, only this time with the police and dead Romero in the background.

That would’ve felt a lot more satisfying to me. The whole mercy-killing thing just felt wrong to me, too sudden and not very climactic.

One more thing…

There was something else I wanted to mention, but now I’ve forgotten it… I’ll blog about it next time, I guess, if I remember…

Hoping to build a computer this year…

Still waiting. You know, for 2016. Because you know what happens in 2016? Presidential election! But, more interestingly, the Oculus Rift will be released! (If everything goes as planned, I guess.)

(Sorry in advance for the materialistic nature of this post. Thinking too much about money and materialistic crap may be harmful to some readers’ souls. Reader discretion is advised.)

I spent some time researching the computer I’ll need to power the Rift, and honestly I’d like to have it ASAP so I can start fooling around with game programming in Unity 5. (As I said before, my current computer runs Unity 4 too slowly, and my OS (Vista) isn’t even officially supported, so trying to learn Unity on here is a bit torturous.) I’ve used this site as a sort of guide for what I’ll need, so I’m basically looking into building the computer myself, which, for all my interest in computers, will be my first time actually building one from individually purchased parts. Fingers crossed that it’ll go well.

My plans currently don’t deviate much from the parts listed on the aforementioned site. I’ll probably look into different cases, as I’d prefer one with a bit more personality (such as a window), but only if I can find one at a good price with some good space for future upgrades should I want them. Hard-drive-wise, I’d like to look into getting both a solid-state drive for the operating system and an old-fashioned mechanical hard drive with 1 or 2 TB for storage. (Composing music can take up a good amount of space when you’re storing big audio files, and games in general can take up significant chunks themselves.) I’ll have to research how to set that up.

Altogether, my current estimate is that the computer will cost $1,200. Of course, when the time comes, I’ll search around for deals and save every bit I can. I’ll probably also search some nearby stores and see if I can pick up anything in person; having to wait for parts in the mail will be torture for my weak impatient soul (though that will probably be the cheaper option). Anyway, I won’t have to worry about it yet; still gotta save the actual money. (It’s tempting to just use my credit card and buy it all now, but I guess I’ll resist.) With the debts I’m still paying off, my phone bill, and my Netflix addiction, it’ll probably take around three months, give or take. I’m currently about 1/6th of the way there, $200 saved of $1,200. So only $1,000 short.

It’s aggravating having to wait; my mind’s been obsessed with dreaming about VR and a new rig all week. Everything I do feels like something to fill the time while waiting. And while that hasn’t really made me more productive, it has actually been a bit cathartic; it’s helped relieve some of my overly self-conscious “is this a good use of my time?” anxiety, which makes me angry whenever I feel like I’ve wasted time, which in turn just makes me waste more time.

I’ve also been looking forward to YouTube’s upcoming game-streaming platform, their answer to Twitch. Maybe I’ll even try streaming some gameplay of my own, though that’ll have to wait until I build that new rig, because I doubt my current Vista-powered computer would stream very well.

I’m also looking forward to the upcoming game for PS4, The Last Guardian, showcased not long ago at E3. I’ve been waiting at least 6 years for this game; it was originally intended as a PS3 game, but it’s been in development for so long that PS4 is now their target console. Check it out:

I don’t have a PS4, but I guess I’ll need to buy one just for this game. Unless I get trapped in my Oculus Rift.

Waiting for the Oculus Rift

It was computer gaming that inspired me to teach myself GW-BASIC programming in elementary school. I wanted to make games. For the past decade, I’ve meandered away from that interest, toward music composition, writing fantasy, and even a couple of years studying character animation. I’ve also been interested in 3D images since a young age, when View-Masters and those Magic Eye books were popular. I’ve loved 3D movies at the theater ever since they became feasible. I pity those who get motion sickness from the experience, but I’ve never understood any other sort of objection to them. (I, on the other hand, get motion sickness when trying to read in a car, which stinks. I usually have to look out the front window to avoid sickness. Of course, when driving, I have to do that anyway.)

So I’m super excited for the Oculus Rift, the virtual reality headset set to come out sometime near the beginning of 2016. That also gives me time to save up the money to buy it, along with the new computer I’ll need to power it. (Which I need anyway, now that my 2009 Alienware laptop is practically useless outside of safe mode thanks to a failing hard drive, and my 2008 Vista-powered desktop is almost out of hard drive space and outdated in quite a few other regards.) The Rift may get me obsessed with gaming again, and of course I’ll also want to explore developing my own projects with Unity 5. (My current desktop doesn’t even support Unity 5, and Unity 4 runs so slowly that it’s a bit agonizing to use.) The wait is hard, but I’m definitely looking forward to it.

Beyond gaming, I can’t help but think about other possibilities the VR gear may make possible. Could I write a novel in a VR world, inspired by fantastical scenery and drowning out real-world distractions? Could I compose music by moving around blocks of notes or something instead of having to click notes into a scoring program? Could I watch a movie (perhaps a 3D one?) in a virtual movie theater so that it’s like watching the movie on a big screen in the distance?

How might VR gear transform websites themselves into virtual experiences? Could I browse books on Amazon as an enormous epic bookstore? Could I make a VR world for my blog?

What about chatting and VR hangouts in virtual worlds?

And then of course there’s Oculus’s Story Studio, which I’ve already blogged about, and which is exploring fascinating possibilities.

I can’t wait to see what awesome new worlds VR technology may make possible!

Movies with Oculus

I thought this was pretty exciting. A little too exciting. So exciting it makes me a bit sick with desire.

I’m definitely saving up for an Oculus dev kit… of course, by the time I can afford it (and the new computer I’ll need to use it), a consumer-oriented Oculus will probably finally be available. Still worth it though.

A movie I’d like to make someday

If you Google around, you can see there are quite a few “personal 3D viewers” available. They look a bit like a virtual reality headset, except they’re for watching movies or playing video games; that is, moving your head around doesn’t do anything. Personally, I’d love to try watching a 3D film with one of these. (Not sure I’d use one in public, though; I’d rather be aware of my surroundings in public.) They’re expensive, close to $1,000, which is a bit out of my price range.

Anyway, wouldn’t it be cool to produce a 3D first-person perspective film to be viewed in one of these 3D viewers?

I know I’m probably not the first to have the idea, but I don’t know of any films produced that are 3D, completely first-person, and designed to be watched with a personal 3D viewer.

I’d also use binaural recording for the sound to really immerse the viewer.  Wouldn’t that be awesome?  Imagine a horror movie produced that way.  Or a newscast.

So… something I’d like to do someday.