The Beauty of AI: Understanding Ontological Hallucinations

This project sets out a concept related to the use of the artificial intelligence models commonly known as “diffusion models,” which are typically used in creative or artistic contexts. It is a personal idea that, in my view, has not yet been fully considered or appreciated.

For anyone who has spent years using and studying AI applications, and building personal or professional skills with them, the concept of hallucination is perfectly clear: in many respects, it is a genuine technical and infrastructural problem.

This is certainly the case for LLMs (both text-only and multimodal), which have been developed to produce output that is as accurate and as semantically coherent with the input as possible, regardless of the structure and language used. There, hallucination is indeed a concrete problem.

If, instead, we look more closely at the artistic world in general, focusing on the simple production of static digital artifacts (proto-images), we may realize that the hallucination produced by a model is truly a “technological wonder,” a unicum that often transcends the representation of the “perfect image” and, at times, even the limits of two-dimensional space itself.

I have always been personally attracted to the hallucinations generated by diffusion models. They were clearly visible from the earliest days of consumer AI, in the first paid services and the first available models. With the passage of time, however, many creators became aware of the rules governing the business of the large consumer market, and of what this entailed in terms of both economic and psychological dependence.

I therefore decided to abandon paid subscriptions (too burdensome) and to focus entirely on the use of open-source models, running them locally.

Thanks to the brilliant team behind ComfyUI, who to my knowledge were among the first to provide a free, low-level interface built on a logic of blocks, connections between atomized process elements, and workflows usable even by non-specialists, I found the tool I needed to explore what existed “beneath the surface.”

It is a bit like going to the sea on a wonderful island and putting on a diving mask: if you don’t try it, you miss half of what is there.

To truly understand what I am talking about, it is essential to use local infrastructure with direct access to the models, encoders, decoders, and so on, which enables direct interaction with the “digital source.” Paid applications do not allow this, because they are engineered to deliver to the public the most “pleasant” result possible.

Ontological Hallucination (or Principle Hallucination)

Starting from an idea that came to me quite spontaneously and unexpectedly, I arrived after some time at a concept that, in my view, is particularly interesting.

If I give a diffusion model a very concise prompt that is intentionally left indeterminate, I obtain a hallucination that I can rightly define as “perfect.”

This is not, therefore, a hallucination that arises from the complexity of “navigating through the domain of information stored in the form of multidimensional elements,” guided by the precise tokenization of the prompt and its subsequent encoding; rather, it is an essentially random response, a sort of message coming from beyond “a certain event horizon.”

I base my request on something that I know in advance the model cannot understand. I therefore formulated a very simple, conceptual prompt, typically philosophical and existential, posing this question:

Who am I? (in English)

We can affirm that:

The input prompt “exists” and is not partial or incomprehensible information.
It is semantically definable; that is, it is a complete sentence that therefore has a precise meaning.
It is a question that implies typically human philosophical, moral, and self-critical reasoning, but it is self-referential to the person who “types” it, and so has no possible counterpart in the model’s data domain.
The model cannot provide a logically true answer.
It is a veritable logical NOT of the very concept of a prompt.

With the help of an LLM application, and drawing on concepts of recursive prompting, I distilled a simple conceptual definition:

«The Ontological Hallucination (or Principle Hallucination) is the generative phenomenon that manifests itself when a diffusion model is prompted with an indeterminate philosophical-linguistic input (e.g., “Who am I?”).

Since the model possesses neither knowledge of the “Self” nor any experience of the “real world,” the output cannot be a truthful answer; it is instead a pure interpolation in the latent space, a statistically random creation.

When such a hallucination produces images of extraordinary beauty, it reveals the machine’s ability to generate the sublime from conceptual emptiness.

It is not an error to be corrected, but a poetic act: the art of orchestrating absence.»

A.S. aka SgtrAiArt or SGTR

Software Used for Artistic Elaboration with AI Models

As described above, I opted for the multimodal, node-based graphical interface ComfyUI. It is completely free and open source, manages models locally, and gives access to highly advanced, low-level features for manipulating the generation process.

Model Used for Experimentation

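My research is based primarily on a model that is very well known in the AI creative community: SD 3.5 by Stability AI, whose weights are openly available and can be run locally (they are released under the Stability AI Community License rather than a classic open-source license).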
Nothing prevents the use of different models; the core concept remains the same.

Workflow and Technical Characteristics

The workflow I use is deliberately elementary: it contains only the elements that are necessary and sufficient to generate an image through the diffusion process in latent space.

Some Information on Workflow Parameters:
Steps: 30-33, to achieve good overall quality of the artifact.
CFG Scale (classifier-free guidance, the cfg input of the KSampler): Any value the KSampler accepts; I usually prefer to stay between 1 and 4, but one can experiment freely.
Depending on the model used, this value modulates how much “autonomous creativity” is left to the system and how closely it follows the typed prompt: lower values leave the model freer, higher values bind it more tightly to the prompt.
Scheduler & Sampler Name: Any combination, according to personal taste and the model used.
The goal is to obtain an artifact of good quality.
CLIP Loader: Standard or quantized.
Batch Size: The number of images generated in parallel in a single run.
The higher the value, the greater the statistical probability of “encountering” a hallucination in that run, including one of the “perfect” type.
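
The remark on batch size can be read as a simple probability statement. What follows is only a minimal sketch, assuming that every image in a batch is an independent draw with the same chance p of being a hallucination worth keeping. Both the function name and the 5% rate are hypothetical and chosen purely for illustration; I have not measured any real rate.

```python
def chance_of_at_least_one(p: float, batch_size: int) -> float:
    """Probability that a batch contains at least one "hit".

    Assumes each image is an independent draw with the same hit
    probability p (an illustrative simplification, not a measurement).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if batch_size < 0:
        raise ValueError("batch_size must not be negative")
    # Complement of "every image in the batch is a miss".
    return 1.0 - (1.0 - p) ** batch_size


# Hypothetical hit rate of 5% per image, for illustration only.
for n in (1, 4, 8, 16):
    print(n, round(chance_of_at_least_one(0.05, n), 3))
```

Under these assumptions the probability grows with the batch size but never reaches certainty, which matches the experience of searching: larger batches help, yet patience is still required.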

Some examples of processing that I define as “In Search of Hallucinations”

Conclusion of the Artistic Process

The creator, besides having set up the workflow, chosen the model to be used, and carried out a considerable number of tests, attempts, and experiments, always plays the most important role in the entire creation process.

It is through personal taste that the piece to be published is selected, in a considered choice that can involve various aspects of one’s artistic and technical personality: from the mood of the moment to the emotion that the work manages to convey.

The creator is therefore an integral and fundamental part of the artistic production; the result is, in all respects, “a reflection of oneself in the work.”

* On X (Sgtr @Sagittary73) you can find numerous and varied experiments, carried out and published from late 2023 onward, just to give you an idea.

Technical Note on the Artifacts

All the works are original and have not been retouched with any editor in any way. They are unique creations, exactly as they are “born” from the model; this choice may sometimes seem limiting, but in my opinion it is fundamental to conveying the creative uniqueness of the artistic process.

NFTs are planned in the near future: each work will be upscaled 4x (linear upscaling) through a proprietary workflow that will not alter any characteristic of the produced artifact.

I chose a factor of 4 because, starting from a base resolution of 1280×768 pixels, it yields an image of 5120×3072 pixels (four times the size on each side), which will allow the buyer to reproduce it on any type of medium, whether physical or digital.

This will make it possible to display the work in any context, public or private; the final images will therefore be in 5K resolution, with a file size ranging between 20 and 30 MB each.
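
The arithmetic behind these figures can be checked with a few lines of Python. This is only a minimal sketch: the function name upscaled_size is purely illustrative and has nothing to do with the upscaling workflow itself, which is not reproduced here.

```python
BASE_WIDTH, BASE_HEIGHT = 1280, 768  # base resolution stated in the text
SCALE = 4                            # linear factor applied to each side


def upscaled_size(width: int, height: int, scale: int) -> tuple[int, int]:
    """Return the (width, height) after a linear upscale of each side."""
    return width * scale, height * scale


new_w, new_h = upscaled_size(BASE_WIDTH, BASE_HEIGHT, SCALE)
# A linear factor of 4 multiplies the pixel count by 4 * 4 = 16.
pixel_ratio = (new_w * new_h) // (BASE_WIDTH * BASE_HEIGHT)
megapixels = new_w * new_h / 1_000_000

print(f"{new_w}x{new_h}, {megapixels:.1f} MP, {pixel_ratio}x the pixels")
# prints: 5120x3072, 15.7 MP, 16x the pixels
```

A width of 5120 pixels is what is commonly called 5K, and a linear factor of 4 means sixteen times as many pixels as the original.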

Local Hardware Configuration

This is the configuration in broad terms:

Form Factor: ATX PC Desktop Midi Tower – ENDORFY Arx 500 Air case
Operating System: Ubuntu 24.04 LTS
RAM: 2× 48 GB DIMM Synchronous Unbuffered (Unregistered) 5200 MHz (0.2 ns)
Graphics Card: PowerColor Radeon Hellhound RX 7900 XTX 24 GB GDDR6
CPU: AMD Ryzen 9 9900X3D 12-Core Processor
Motherboard: MAG X870E TOMAHAWK WIFI (MS-7E59)

Costs

Approximately $3,000 amortized over 5 years, plus electricity costs, for the unlimited production of artifacts (with no subscriptions, a factor certainly worth considering).
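
To put this in perspective, the monthly cost can be estimated roughly as follows. It is a sketch only: the $3,000 and the 5 years come from the line above, while the power draw, the hours of use, and the electricity price are hypothetical placeholders that everyone should replace with their own values.

```python
def monthly_cost(hardware_usd: float, years: float, watts: float,
                 hours_per_month: float, usd_per_kwh: float) -> tuple[float, float]:
    """Return (amortization, electricity) in USD per month."""
    amortization = hardware_usd / (years * 12)
    # Energy in kWh = kilowatts * hours; cost = energy * price per kWh.
    electricity = watts / 1000 * hours_per_month * usd_per_kwh
    return amortization, electricity


# 3000 USD over 5 years is from the text; the other values are placeholders.
amort, power = monthly_cost(3000, 5, watts=500, hours_per_month=40,
                            usd_per_kwh=0.30)
print(f"amortization {amort:.2f} USD/month, electricity {power:.2f} USD/month")
# prints: amortization 50.00 USD/month, electricity 6.00 USD/month
```

The hardware alone therefore works out to about $50 per month; the electricity figure depends entirely on the assumed usage and local tariff.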

Conclusions

I hope this text can be interpreted as a gift to the AI creative community, especially to my beloved crew at Ai Art Today on X.

A.S. aka SgtrAiArt or SGTR
