Hi all,
As I was kindly asked to explain the creative process behind the "Broozel City Guide" posts, a kind of graphic novel I wrote a couple of days ago, here is an article that dives into some general resources and basic examples around generative AI.
Generative AI
I use digital generative tools (VQGAN+CLIP) to create the pictures and some text generation tools (mainly GPT-J) to produce sentences or ideas. As for some other digital artists, what happens here is a kind of dynamic exchange between randomness and my own cultural background, more than just copy/pasting what a computer might output. The objective is not to produce massively but to use these tools like new instruments. Think about the first time 'sampling' was used in music in the early '80s.
Access to the tools
NightCafe
I have heard of people using https://creator.nightcafe.studio/ which offers online text-to-image picture creation. It is very easy to use, but it is not free.
Dalle mini
Currently, a limited online demo will give you a direct feeling for this technique, thanks to the great Hugging Face community. They set up this website https://huggingface.co/spaces/flax-community/dalle-mini with a limitation on image size.
Google Colab
The next step requires a bit of knowledge and learning to use Google Colab, which needs a Google account.
You will find plenty of tutorials on the net. In short, you remotely use one of Google's computers and run programs in a pre-configured AI environment template that provides all the resources needed to create your images or your text.
Although this works pretty well, you can easily run into error messages that may frustrate you, depending on your technical skills to sort them out.
Take this one as an entry point for the learning curve.
Then you may try this VQGAN+CLIP script
https://colab.research.google.com/drive/1ZAus_gn2RhTZWzOWUpPERNC0Q8OhZRTZ
Or this one
https://colab.research.google.com/github/justinjohn0306/VQGAN-CLIP/blob/main/VQGAN%2BCLIP%28Updated%29.ipynb?authuser=3#scrollTo=g7EDme5RYCrt
Or explore other colabs like in this article
https://towardsdatascience.com/12-colab-notebooks-that-matter-e14ce1e3bdd0
and this one I just found
https://ljvmiranda921.github.io/notebook/2021/08/11/vqgan-list/
Finally, if you want to run it locally, you will definitely need to master some skills and set up a fairly powerful machine with a graphics card. Read some related notes here: https://pythonrepo.com/repo/nerdyrodent-VQGAN-CLIP-python-deep-learning
Using the tools
I can't draw, which has frustrated me since childhood, but I like staring at drawings and all kinds of visuals. From now on, I don't have to draw anything myself: I get a wild creative assistant that will draw something for me, like the Little Prince asking for a sheep.
Draw me a sheep in Dall-e Mini
Prompting
You should know that the computer does not consciously draw a sheep. The program aims to predict the pictorial representation of the wording you give it as input to start its generative processing. This is called "prompting".
Crafting this written prompt is a creative activity in itself, since it is your main way to influence the results this virtual assistant provides. As a lot of randomization is also involved, you find yourself experimenting with an oneiric conversational chemistry between casting your written invocation and receiving the result of your spell, which in turn catalyzes your own imagination.
Like alchemists and chefs, users share their magical wordings like recipes that give stunning results. But the Verb is as vast as the universe, and the initial database of those engines covers a large part of our world's artistic heritage. With a certain bias, for sure. But like a young Harry Potter, you may try some recipes, varying just some of the wording yourself, and get majestic pictures.
Example
For example, try DALL-E mini with the prompts below, and see for yourself how far we are from my basic 'sheep' example.
Prompt:
Strangely Beautiful still Dusty Sunbeams sunspots in Half destroyed log wood cabin by James Gurney, Mark Maggiori, Thomas Moran | Environment Interior Keyframe Concept Art Trending on ArtStation | Cinematic Atmospheric Keyframe known for its composition
Now let's vary the beginning of the sentence:
Vast Medieval Banquet in the Cathedral of Fulcanelli by James Gurney, Mark Maggiori, Thomas Moran | Environment Interior Keyframe Concept Art Trending on ArtStation | Cinematic Atmospheric Keyframe known for its composition
As you can see, all these keywords weigh on the way the engine makes its predictions.
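The two prompts above share the same structure: a subject that changes, followed by a fixed list of artists and style keywords separated by " | ". As a minimal sketch (the helper name `build_prompt` and the three-part split are my own assumptions for illustration, not something the tools require), this recipe can be written down so that only the subject varies:

```python
def build_prompt(subject, artists, modifiers):
    """Join a subject, a list of artists and style keywords into one prompt.

    Hypothetical helper: the image tools only see the final string.
    """
    head = f"{subject} by {', '.join(artists)}"
    return " | ".join([head, *modifiers])


# Fixed part of the recipe, copied from the example prompts above.
ARTISTS = ["James Gurney", "Mark Maggiori", "Thomas Moran"]
MODIFIERS = [
    "Environment Interior Keyframe Concept Art Trending on ArtStation",
    "Cinematic Atmospheric Keyframe known for its composition",
]

# Variable part: only the subject changes from one picture to the next.
SUBJECTS = [
    "Strangely Beautiful still Dusty Sunbeams sunspots in Half destroyed log wood cabin",
    "Vast Medieval Banquet in the Cathedral of Fulcanelli",
]

prompts = [build_prompt(subject, ARTISTS, MODIFIERS) for subject in SUBJECTS]
for prompt in prompts:
    print(prompt)
```

Each printed line can be pasted as-is into the prompt field of the tool you use. Keeping the style part fixed makes it easier to see what a change of subject alone does to the picture.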
Additionally, a set of parameters and other processing steps can be applied, like training your own model, tuning colors, changing the image size, and other expert features still in development. The possibilities are infinite.
Text authoring with GPT-J
For the text and the creation of Broozel, I often get an idea from staring at the surprising picture the AI produces after one of my bizarre prompts. To develop such an idea, I sometimes use generative language models.
For example, I had the idea of a musical city whose places would have names drawn from music. I had already written some ideas, but I wanted more shuffling, I wanted to extend the brainstorming... So even if I'm convinced of my own creativity, I thought it would be cool to try those models, known to be quite good at this kind of game.
Although plenty of websites are now selling such services, I am quite satisfied with the demo of GPT-J, a smaller open-source language model in the style of GPT-3, which is reachable at https://6b.eleuther.ai/
GPT-J producing text from my prompt in bold, used as inspiration
Similarly, but faster and more practical: https://bellard.org/textsynth/
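For this kind of brainstorming, one common trick is to give the model the start of a numbered list and let it continue. The sketch below only builds that text, ready to paste into one of the demos above. The helper name `build_list_prompt`, the intro sentence and the three place names are hypothetical placeholders, not the ones I actually used:

```python
def build_list_prompt(intro, examples):
    """Build a numbered list that ends on an open item for the model to fill.

    Hypothetical helper: the text demos only receive the final string.
    """
    lines = [intro, ""]
    for number, example in enumerate(examples, start=1):
        lines.append(f"{number}. {example}")
    # Leave the next number open so the model continues the list.
    lines.append(f"{len(examples) + 1}.")
    return "\n".join(lines)


# Placeholder seed ideas, invented for this illustration.
SEED_IDEAS = ["Crescendo Square", "Allegro Avenue", "The Fermata Cafe"]

prompt = build_list_prompt(
    "Places to visit in a city where everything is named after music:",
    SEED_IDEAS,
)
print(prompt)
```

The model tends to pick up the pattern of the list and propose new entries in the same spirit, which you can then keep, twist or throw away.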
Afterwards, as with new instruments, it's a matter of playing with all the knobs to get the sound that fits your intuitive vibration, staying connected with both the response given by the machine and your creative goals. That's how this little story took form step by step, like a Lego composition inspired by my own cultural references, from Italo Calvino to the architects of Art Nouveau, the writings of Borges and Heraclitus, and my love of music.
So don't hesitate: be creative, write your own dreams.