Generative AI applications are increasingly widely used thanks to their broad availability. The latest of them can perform a range of routine tasks, such as organising and classifying data. However, it is their ability to write lyrics, compose music and create digital art that has captured the imagination of people worldwide and convinced them to experiment on their own.
GenAI – from text to virtual worlds
The phrase “the future is here” has never sounded truer than it does today.
Machine learning is no longer limited to predictive models used to observe, identify and classify patterns in content. Thanks to generative AI, it can now generate all kinds of new content from natural language prompts: text, images, video, music or code. It has many practical applications, such as writing articles, creating product designs or optimising business processes.
GenAI has implications for various industries – from IT organisations that can use the (largely correct) code generated by AI models to companies that need business or marketing content fast.
The output of generative AI models can be indistinguishable from human-created content. Moreover, these models typically include an element of randomness, which means they can produce a variety of outputs from a single input request. This makes their output seem even more realistic.
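To make the role of randomness more concrete, here is a minimal Python sketch of how a model might pick its next word by sampling instead of always choosing the top candidate. The word scores, the function name and the "temperature" setting are illustrative assumptions and are not taken from any particular model or library.

```python
import math
import random

def sample_with_temperature(scores, temperature, rng):
    """Pick one candidate at random, weighted by its score.

    A higher temperature flattens the weights (more variety);
    a temperature close to zero almost always picks the top candidate.
    """
    scaled = [score / temperature for score in scores.values()]
    highest = max(scaled)
    # Softmax-style weights; subtracting the maximum keeps exp() stable.
    weights = [math.exp(value - highest) for value in scaled]
    return rng.choices(list(scores.keys()), weights=weights, k=1)[0]

# Hypothetical scores for the word that follows "Tomorrow will be ...".
next_word_scores = {"sunny": 2.0, "rainy": 1.5, "purple": 0.1}

rng = random.Random(0)  # fixed seed so the run is repeatable
varied = [sample_with_temperature(next_word_scores, 1.0, rng) for _ in range(10)]
focused = [sample_with_temperature(next_word_scores, 0.01, rng) for _ in range(10)]

print("temperature 1.0: ", varied)
print("temperature 0.01:", focused)
```

With the very low temperature the sketch keeps returning the highest-scoring word, while at temperature 1.0 the same input can yield different words on different draws.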
GPT-4 changes the rules of the game
ChatGPT continues to be the most popular generative AI tool. It has been trained on a huge volume of text: books, articles and code repositories. It responds to prompts and questions in natural language with complete sentences.
GPT, the engine that drives it, is available in two recent versions: GPT-3.5 and the more powerful GPT-4. Both can be used from the browser and through an API.
Whereas GPT-3.5 is a text-only model that struggles with logic puzzles, GPT-4 is a multimodal model with improved collaboration and creativity. It processes not only text prompts but also images; it can pass a bar exam or a programming test, and it can even detect sarcasm.
GPT-4 focuses on factual accuracy but also on steerability. This means it is easier to have tasks completed in exactly the way the user specifies, and the model produces fewer wrong answers and hallucinations. GPT-4 is also easier to use in everyday IT work.
In ChatGPT, GPT-3.5 is available for free, while access to GPT-4 costs $20 per month.
Of course, many other large language models (LLMs) have appeared, e.g. LaMDA and later PaLM, on which the Bard chatbot is based. Microsoft has created Bing AI, which gives detailed answers based on live information. However, most AI text generators use a GPT model, increasingly the latest version, GPT-4. As a result, they resemble one another and offer similar functions.
GAN – the fight between good and evil
For voice recognition, text generation and, more often, video or image generation, a prediction system based on a GAN (Generative Adversarial Network) is used. It works using two competing neural networks. One network, the generator, creates new, synthetic data instances, while the other, the discriminator, tries to tell them apart from real data. As the two compete, the generator learns to imitate the distribution of the real data, and its output becomes increasingly hard to distinguish from it.
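As a rough illustration of how two competing networks can be scored against each other, the Python sketch below computes the binary cross-entropy objectives used in many GAN formulations. The function names and the probability values are illustrative assumptions; in a real GAN these probabilities would come from trained neural networks, which are not shown here.

```python
import math

def discriminator_loss(p_real: float, p_fake: float) -> float:
    """Loss for the network that judges samples.

    p_real: probability it assigns to "real" for a genuine sample.
    p_fake: probability it assigns to "real" for a generated sample.
    It wants p_real close to 1 and p_fake close to 0.
    """
    return -(math.log(p_real) + math.log(1.0 - p_fake))

def generator_loss(p_fake: float) -> float:
    """Loss for the network that creates samples (non-saturating form).

    It wants the judging network to be fooled, i.e. p_fake close to 1.
    """
    return -math.log(p_fake)

# Hypothetical judgements at two stages of training.
early = {"p_real": 0.9, "p_fake": 0.1}  # fakes are still easy to spot
later = {"p_real": 0.6, "p_fake": 0.4}  # fakes have become more convincing

print("generator loss, early:", generator_loss(early["p_fake"]))
print("generator loss, later:", generator_loss(later["p_fake"]))
print("discriminator loss, early:", discriminator_loss(**early))
print("discriminator loss, later:", discriminator_loss(**later))
```

The two losses pull in opposite directions: as the generated samples become more convincing, the generator's loss falls while the discriminator's loss rises, which is the competition the name "adversarial" refers to.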
The potential of GANs can be used for both good and evil. This is how many fake pictures and fake videos are created. However, GAN-based video prediction can also help, for example, in detecting anomalies, which is needed in many sectors, such as security and surveillance.
GAN principles of operation:
- understanding both the temporal and spatial elements of an image or film;
- generating the following sequence based on this knowledge;
- distinguishing between probable and improbable sequences.
To be able to work with the processes underlying these technologies, we need high-quality data, transparency, complete documentation and AI ethics.
AI generators often work like a magic wand. And, like everything in the wizarding world, magic depends more on the wizard than the instrument.
Generative AI – 5 areas of application
Among many applications of GenAI technology, we’ve selected those used in creative fields, where it will undoubtedly affect the methods and optimisation of the creative process. The use of GenAI is already bringing significant benefits to marketing and design teams, enabling the creation of high-quality content quickly and efficiently.
Large companies such as Adobe, Canva, Microsoft (with Designer) and Shutterstock have long used GenAI tools to allow users to edit images, create graphics and generate videos. Many generative models are also open source, meaning anyone can access and improve existing models or create entirely new ones.
AI is unlikely to supplant humans in terms of creativity because we have one significant advantage: creativity is fundamentally human. Art begins with man’s intention and ends with his choice.
1. Everyday creativity: text generation
How does it work?
The current generation of machine learning models for working with text is based on so-called self-supervised learning. The model is given a massive amount of text, which allows it to identify patterns and predict what comes next, e.g. how to finish a sentence from its first few words.
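The idea of learning to predict what comes next from raw text can be sketched in a few lines of Python. Note the assumptions: real language models use large neural networks, whereas this toy stand-in only counts which word most often follows another, and the training sentence and function names are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_model(text: str) -> dict:
    """Count, for every word, which words follow it in the text.

    No labels are needed: the text itself supplies the "correct answer"
    (the next word) for every position.
    """
    words = text.lower().split()
    model = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        model[current_word][next_word] += 1
    return model

def predict_next(model: dict, word: str) -> str:
    """Return the most frequent follower of `word`, or "" if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return ""
    return followers.most_common(1)[0][0]

# Made-up training text for illustration only.
training_text = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram_model(training_text)

print(predict_next(model, "the"))  # prints "cat"
```

In the training text, "the" is followed by "cat" twice and by "mat" and "fish" once each, so "cat" is the prediction.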
As a result, from a short prompt, GenAI tools can create high-quality, authoritative-sounding text in a matter of seconds (an article, a poem, a story) and then reformulate it to fit the purpose better.
GPT-4 allows the user to co-author and edit creative writing at a higher level. Example tasks include composing songs, writing screenplays and realistic-sounding dialogue, or learning a particular writing style through iteration (i.e. repeating the process until a specific condition is met).
Text generation can strengthen creative teams and contribute to better brand recognition. GenAI is increasingly used to create dialogues, headlines or advertisements in marketing, gaming and other industries.
These tools are used in chats for real-time communication or for creating product descriptions or social media content.
Most AI text generation applications give access to a GPT model through a user interface and let you control its output. Most of them also offer a text editor where you can edit AI-generated text directly in the app.
Examples of applications that use AI text generation:
- Duolingo – helps you learn languages by providing opportunities for both speaking practice and comprehension development;
- Quizlet – extends learning with Q-Chat, a “personal learning coach”;
- Notion – summarises relevant and valuable information from user notes.
2. Digital art: image generation
How does it work?
Thanks to AI image generators, you can quickly generate visual material. The user can transform text into more or less realistic images based on a custom setting, theme, style or location.
An image generator can be helpful at work, e.g. for a graphic designer, who can instantly create inspiration for visual materials. Just enter a prompt, and the image you have imagined will appear on the screen.
AI-generated images are useful for hobby, commercial and educational purposes alike. These tools can help with branding, social media content, vision boards, invitations, flyers, and more.
Several AI image generators, some of them open source, are on the market to suit different needs. You can now create your own graphics of objects, animals, abstract shapes and paintings, but also imitate existing ones.
- DALL-E (named after the artist Salvador Dalí and the Pixar robot WALL-E) by OpenAI – trained on a vast collection of images, it creates new, unique images based on user-entered descriptions. DALL-E 2 caused a sensation thanks to its advanced capabilities and easy access.
- Midjourney (MJ) – despite the high cost of use, MJ is currently one of the most popular and increasingly advanced tools for generating graphics based on text instructions. The complete list of its parameters is available on the Midjourney website (you must download the Discord application before registering).
- Stable Diffusion – works similarly to MJ but offers many more options for experimenting with settings. It can also be used to generate video (see section 3, Interactive experiences: video generation).
Artbreeder, for example, converts text into drawings, edits images (portraits, landscapes and various forms of graphics), changes aspects of a face (skin tone, hair and eye colour) and creates collages from images and shapes. It can also transform a vision into moving animations.
Actual examples of the use of these tools can be found on the linked pages.
Of course, AI-generated art raises the fundamental question of whether it can still be called “art”, which does not change the fact that it is becoming common and valuable and, simultaneously, a fun way of creating images in a few moments.
3. Interactive experiences: video generation
How does it work?
A text-to-video generator automatically creates a video (or an image) based on a description provided by the user. The generator analyses the prompt, recognises keywords and then creates a video based on them, which can also be edited.
In this way, you can create:
- feature films and animations;
- short videos for social media;
- advertisements and presentations;
- educational and instructional videos.
Individual video generators can include a set of generative AI tools, such as text-to-video, image-to-image, removing and replacing items, replacing parts of sections or the ability to train your own AI models to generate an image.
Public tools for generating simple videos are provided by, e.g. Runway.
Recently, models for creating high-quality videos have appeared, though they are still in the testing phase. These are:
- Gen-2, the newest model from Runway – a new adaptation of Stable Diffusion that allows you to create fantastic video content from text prompts alone (Gen-2 | Runway);
- Deforum, a new project on GitHub – offers helpful examples and tips on how to create music videos and videos that transition between text prompts using Stable Diffusion (Stable Diffusion Videos).
Stable Diffusion, a GenAI model developed by Stability AI, has a wide range of applications – from generating detailed images to filling in missing fragments, creating backgrounds, and even image-to-image transformation guided by a text prompt. Importantly, Stable Diffusion is open source: Stability AI has made its source code publicly available. Thanks to this, the latest video generation tools have been built on top of it.
Examples of creative video projects using these tools exist on the sites listed above.
4. Sound for all: sound generation
How does it work?
Creating complex compositions with traditional music production methods is quite an expensive endeavour. AI makes production much cheaper. GenAI tools can generate audio based on text descriptions. And although this music will not win any Grammy awards, it sounds more and more like something a human could create.
This applies to various aspects of the sound creation process – from music composition through audio mastering to online streaming. Many musicians and record labels are looking for new ways to integrate AI technology.
However, they still use it as a complementary tool rather than a substitute for artists. Some systems can create songs in the style of selected composers, while others use GenAI to generate completely unique sounds. These tools can be used to design music videos – overlaying sound recordings on video sequences.
Key capabilities of these tools include:
- availability of many different types of music;
- the possibility of imitating old and contemporary composers;
- tracking music over a long period of time.
MuseNet can generate tracks with up to 10 different instruments and music in 15 different styles, as well as combine different musical genres. While still unable to compose truly original music of its own, this tool discovers patterns of harmony, rhythm and style by learning to predict what comes next.
Magenta, whose plugin works with a classic music production program, includes tools for generating melodies, harmonies and rhythms. They are used in commercial projects, such as advertising jingles or background music in games, where time and budget matter greatly and the participation of musicians is not required.
Examples: several good MuseNet samples are available on SoundCloud; “High on Life” is a game by Squanch Games in which the music and, in part, the graphics and character voices were generated with AI tools such as Midjourney.
Despite the opposition faced by game developers and publishers who have started using GenAI to create music and more, prominent game companies such as Unity, Epic Games, Roblox and Ubisoft have announced the integration of generative AI into their technology stacks.
5. Simplified code: code generation
How does it work?
GenAI is a powerful tool for faster mobile and web app development. Generative AI models can generate programming code without manual coding. This is especially useful for so-called boilerplate code, i.e. code that does nothing particularly interesting but has to be written because it connects other parts of the application.
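For readers unfamiliar with the term, the Python sketch below shows what boilerplate code typically looks like: repetitive conversion between a raw dictionary (for example, parsed JSON) and a typed object. The User class and both function names are hypothetical; they only illustrate the kind of glue code a developer might ask a code assistant to produce.

```python
from dataclasses import dataclass

@dataclass
class User:
    """A plain data container; nothing here is application logic."""
    id: int
    name: str
    email: str

def user_from_dict(data: dict) -> User:
    """Copy fields from a parsed payload into a typed object."""
    return User(
        id=int(data["id"]),
        name=str(data["name"]),
        email=str(data["email"]),
    )

def user_to_dict(user: User) -> dict:
    """Copy the fields back out, e.g. before sending them to another service."""
    return {"id": user.id, "name": user.name, "email": user.email}

# Hypothetical payload for illustration.
payload = {"id": "7", "name": "Ada", "email": "ada@example.com"}
user = user_from_dict(payload)
print(user)
print(user_to_dict(user))
```

Nothing in this code is difficult, but every field has to be spelled out several times, which is why such code is a natural candidate for automatic generation followed by human review.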
Thanks to GenAI, one can also create new code from natural language prompts and even translate code from one programming language to another. Uses of GenAI in coding:
- Code completion as programmers type, which saves time and reduces errors, especially in repetitive or tedious tasks.
- Code optimisation: reviewing code and suggesting improvements, or generating more efficient or easier-to-read implementations.
- Identifying and fixing errors in generated code by analysing code patterns, indicating potential problems and suggesting fixes.
- Automated code refactoring for easier maintenance and updates.
- Code style checking: analysing code for compliance with coding style guidelines to ensure consistency and readability.
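To show what a style check from the list above actually looks for, here is a minimal rule-based Python sketch. It is not how an AI assistant works internally; it is a hand-written check, using Python's standard ast module, for one assumed guideline (function names in snake_case). The function name and the sample source are illustrative.

```python
import ast
import re
from typing import List

# Assumed guideline: function names use lowercase letters, digits and underscores.
SNAKE_CASE = re.compile(r"^[a-z_][a-z0-9_]*$")

def find_badly_named_functions(source: str) -> List[str]:
    """Return the names of functions that break the snake_case guideline."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef) and not SNAKE_CASE.match(node.name)
    ]

# Made-up source code to check.
sample = (
    "def loadData():\n"
    "    pass\n"
    "\n"
    "def save_data():\n"
    "    pass\n"
)

print(find_badly_named_functions(sample))  # prints ['loadData']
```

A GenAI-based checker can go further than fixed rules like this one, for example by proposing a better name, but its suggestions still benefit from review.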
Thanks to GenAI features, coding is possible not only for professionals but also for non-technical people. However, as with any development tool, it is recommended that the code generated in this way is reviewed by someone with a deeper knowledge of the field before the code is incorporated into a production environment.
One of the best examples is GitHub Copilot, which uses the OpenAI Codex model to offer code suggestions right in the developer’s editor. According to GitHub’s research, 96% of surveyed developers said the tool makes them faster at repetitive tasks, and 88% said they feel more productive. It lets them focus on more critical issues, which contributes significantly to job satisfaction.
With the help of generative AI tools, you can also reduce the cost of building websites, including those based on content management systems such as WordPress. One example is CodeWP, an AI code generator trained specifically for WordPress, WooCommerce, PHP, JS and jQuery.
Numerous studies and surveys show that many companies already use GenAI tools. They use them to generate ideas, create ad copy, analyse data or automate time-consuming and tedious tasks.
If your company wants to use these tools’ potential, we invite you to our YouTube channel, Beyond AI. We discuss topics like AI development and show practical applications of the newest GenAI tools. We assure you that after just a few videos, you will better understand the current AI revolution!