Virtual Human Consistency – ChatGPT, Low Entropy, and MMI

Today we won’t stop at creating a Virtual Human.
This is not the time for prepackaged prompts or copy-and-paste shortcuts.
You are not here to become replicating digital zombies; you are here to become proactive prompt-engineering nerds, the conscious artisans of a new creative era. The only real path forward is reasoned experimentation, where every word generates structure and every structure generates meaning. That is why I will start with a few slightly complicated concepts. I admit it is not always easy to keep a balance between popularization and specialist topics, but I want to try. Let’s go!

Entropy and MMI

In artificial intelligence and information systems, entropy has long dominated design thinking. Its influence shows up in probabilistic models, in generative mechanisms, and even in the optimization criteria of many modern systems. A new view is gaining ground, however: a cognitive architecture oriented not toward controlled randomness but toward maximizing informational coherence. This article explores a system conceived as a low-entropy structure, founded on the principle of Maximum Mutual Information (MMI), a view that promises to revolutionize how we understand the relationship between data, meaning, and intelligence.

What entropy is and why reduce it

In information-theoretic terms, entropy is the average amount of uncertainty or unpredictability in a system; in practice, it is what annoys us when generative answers turn out unsatisfying. In probabilistic models, high entropy means the probability is spread across many possibilities, which is useful for generating variety but often costly in terms of semantic coherence or relational relevance.

A low-entropy system, by contrast, selects more ordered and predictable configurations, not because they are limited but because they are informationally dense. In this sense, low entropy is not a synonym for poverty but for selective discrimination: every output results from a high relational intensity among the elements involved. In looser but more understandable terms, you get low entropy with long, structured prompts that shrink the cloud of tokens the AI has to “work” on.
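To make the idea of “average uncertainty” concrete, here is a minimal sketch that computes Shannon entropy in bits for two next-token distributions. The two distributions are invented for illustration; they are not measured from ChatGPT or any real model.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical distributions over four candidate tokens (made-up numbers).
vague_prompt = [0.25, 0.25, 0.25, 0.25]       # every option equally likely
structured_prompt = [0.85, 0.05, 0.05, 0.05]  # one option clearly favored

print(shannon_entropy(vague_prompt))       # 2.0 bits: maximum uncertainty for 4 options
print(shannon_entropy(structured_prompt))  # below 1 bit: far less uncertainty
```

The flatter the distribution, the higher the entropy; a prompt that concentrates probability on a few outcomes lowers it.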

The principle of Maximum Mutual Information (MMI)

Mutual Information (MI) measures the amount of information shared between two variables. When two elements have high MI, knowing one lets you predict the other better: there is coherence, connection, structure.
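As a rough illustration, mutual information can be computed directly from a joint probability table. The sketch below assumes two binary variables; the tables are toy values chosen by me, where X could stand for a detail in the prompt and Y for a feature of the output.

```python
import math

def mutual_information(joint):
    """Mutual information I(X;Y) in bits.

    joint[i][j] is P(X = i, Y = j); the entries are assumed to sum to 1.
    """
    px = [sum(row) for row in joint]        # marginal of X
    py = [sum(col) for col in zip(*joint)]  # marginal of Y
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi

# Hypothetical toy tables (made-up numbers).
independent = [[0.25, 0.25], [0.25, 0.25]]  # knowing X says nothing about Y
coupled = [[0.5, 0.0], [0.0, 0.5]]          # knowing X determines Y

print(mutual_information(independent))  # 0.0
print(mutual_information(coupled))      # 1.0
```

Zero bits means no shared information; one bit, for binary variables, means each variable fully predicts the other.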

The MMI approach, that is, maximizing the mutual information between input and output, between internal components, and between signals and meanings, implies that the system selects not what is statistically most frequent but what is informationally most relevant. This overturns the classic probabilistic setup: you do not choose the most probable option, but the one densest in informational relations. In short, we need to force the AI to pick what we need, not what is statistically most frequent.
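The difference between “most probable” and “most informative” can be sketched with a toy scoring rule. Here I assume a pointwise mutual information score, log2 of P(output | prompt) / P(output), which is one common way to express an MMI-style criterion; it is not a description of how ChatGPT selects its output, and all the probabilities are invented.

```python
import math

# Hypothetical candidates (made-up numbers):
# p_given_prompt is P(output | prompt),
# p_overall is how common the output is regardless of the prompt.
candidates = {
    "generic smiling portrait": {"p_given_prompt": 0.50, "p_overall": 0.40},
    "same face, left profile, studio light": {"p_given_prompt": 0.30, "p_overall": 0.02},
    "random landscape": {"p_given_prompt": 0.20, "p_overall": 0.58},
}

def most_probable(cands):
    """Classic choice: the output with the highest conditional probability."""
    return max(cands, key=lambda y: cands[y]["p_given_prompt"])

def most_informative(cands):
    """MMI-style choice: the output most specifically tied to the prompt."""
    def pmi(y):
        c = cands[y]
        return math.log2(c["p_given_prompt"] / c["p_overall"])
    return max(cands, key=pmi)

print(most_probable(candidates))     # generic smiling portrait
print(most_informative(candidates))  # same face, left profile, studio light
```

The generic portrait wins on raw probability because it is common everywhere; the specific one wins on information because the prompt is what makes it likely.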

A low-entropy system based on MMI

The structure of the system described here relies not on random generation weighted by distributions but on optimizing informational links. Every output is chosen so as to reduce the entropy of the context, that is, to strengthen internal coherence. You will notice that the system I am showing you favors configurations with high semantic interdependence and penalizes information that is only weakly connected to the context; the output is more predictable, but also more meaningful, because every element is informationally justified. Translated once more: with the prompt, I take the system by the hand and walk it through the whole generative process, leaving a little room only where needed so as not to smother the generative algorithms.

I know, this time I am raising the bar a bit higher than usual.
But it is necessary if we really want to grow: it is time to raise our technical level and start reasoning in a more structured, deeper, more conscious way.

Order vs. chaos

Symbolically, we can read entropy as cognitive chaos: too many possibilities, too many roads, too little orientation. The MMI system instead works toward an emergent order, not imposed from outside but generated by the density of informational relations, without falling back into determinism.

The prompts that follow leave no room for generative ambiguity.
Every word, every semantic detail carries constraints, directions, and relations that tell the system: do not generate what is statistically most plausible, but what is informationally most specific and coherent.

The key is vertical semantic density: emotional expressions, poses, contexts, positions, lights, same person, same aesthetic. The system does not need to explore the whole generative space (high entropy), because it immediately finds a high-coherence zone of low operational entropy.

The prompt is not just a description; it is a network of informational constraints. Each constraint shrinks the space of generative possibilities, concentrating the output on high-coherence, low-variability solutions. The goal is to obtain digital people who stay consistent across generations, and that consistency is non-negotiable.
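A toy model shows how each constraint shrinks the space. The sketch below assumes a small, invented attribute space and treats every combination as equally likely, so the entropy is simply log2 of the number of outcomes still allowed; a real image model is of course far more complex than this.

```python
import itertools
import math

# Hypothetical attribute space for a generated portrait (illustrative only).
space = {
    "view": ["front", "left side", "back"],
    "lighting": ["studio soft", "dramatic", "backlit", "neon"],
    "background": ["white", "gray", "street", "forest"],
    "style": ["photorealistic", "anime", "3D render", "illustration"],
}

def count_outcomes(space, constraints):
    """Count attribute combinations compatible with the given constraints."""
    keys = list(space)
    total = 0
    for combo in itertools.product(*(space[k] for k in keys)):
        outcome = dict(zip(keys, combo))
        if all(outcome[k] in allowed for k, allowed in constraints.items()):
            total += 1
    return total

def bits(n):
    """Entropy in bits of a uniform choice among n outcomes."""
    return math.log2(n)

no_constraints = count_outcomes(space, {})
with_prompt = count_outcomes(space, {
    "lighting": {"studio soft"},
    "background": {"white", "gray"},
    "style": {"photorealistic"},
})

print(no_constraints, round(bits(no_constraints), 2))  # 192 outcomes
print(with_prompt, round(bits(with_prompt), 2))        # 6 outcomes
```

Three constraints take the space from 192 combinations to 6, and only the view is left free, which is exactly the degree of freedom a turnaround sheet needs.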

Enough theory.
I will leave you with one last, even more essential example.
Imagine you have to talk to an apprentice who is talented at execution but scatterbrained, often with their head in the clouds. Wouldn’t it come naturally to explain every step with almost surgical clarity, one step at a time, to avoid misunderstandings?

In practice with ChatGPT

Let’s head to https://chatgpt.com/images and call on our talented apprentice, ChatGPT’s latest image engine: we want to test its coherence by creating a Virtual Human. The common thread linking the various generations will be the system’s gen_id, an alphanumeric ID that ChatGPT assigns internally to an image generation.

What does the gen_id lock?

Element	Locked by gen_id?
Face	✅
Body / proportions	✅
Skin	✅
Age	✅
Clothes	❌
Shoes	❌
Makeup	❌
Hair (style)	❌
Poses / scenes	❌

In other words, what gets locked is the anatomical structure of the “bare” human model, without hair; the hairstyle is in fact more variable than the other parts of the body. The three complex prompts that follow should be used in the order shown.
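If you want to keep track of the order and of the gen_id as you go, a small bookkeeping helper can be sketched like this. Everything in it is hypothetical: the step names, the `Reference gen_id:` line, and the placeholder ID are my own, and nothing here calls ChatGPT. It only prepares the text you would paste by hand.

```python
# Hypothetical bookkeeping only: no ChatGPT API is called here.
STEPS = ["turnaround", "expression_sheet", "pose_scene"]

def build_request(step, prompt_text, gen_ids):
    """Return the text to paste for a step.

    Steps after the turnaround are prefixed with the turnaround's gen_id,
    so the identity reference travels with every later request.
    """
    if step not in STEPS:
        raise ValueError(f"unknown step: {step}")
    if step == "turnaround":
        return prompt_text  # first step: no gen_id exists yet
    if "turnaround" not in gen_ids:
        raise RuntimeError("run the turnaround first and note its gen_id")
    return f"Reference gen_id: {gen_ids['turnaround']}\n\n{prompt_text}"

# Usage with a made-up ID; the real one comes from the first generation.
gen_ids = {"turnaround": "abc123"}
print(build_request("expression_sheet", "<prompt 2 text>", gen_ids))
```

The point of the guard is simply to stop you from running the second or third prompt before the first one has produced the gen_id that anchors the identity.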

## 1. Realistic Full Body Turnaround from Reference Photo

### INPUT
Ask for one or more reference images, or a text description

### POSITIVE PROMPT
Create a full body realistic turnaround (no dynamic poses) of the same person from the uploaded reference photo.  
Generate a clean character turnaround sheet with 3 views:  
- Front view  
- Left side view  
- Back view

All three views must:
- Be perfectly aligned on the same baseline
- Be evenly spaced and clearly readable

Canvas & Composition Rules
- Do NOT crop or cut off any part of the body in any view.
- The entire figure (from head to feet) must be fully visible in all three views.
- Leave a clean white margin around the entire canvas, creating a visible white border on all sides.
- Ensure that no part of the character touches the edges of the image. 

The person must keep the **exact same identity**: same face, body proportions, height, anatomy, skin tone, facial features, hairstyle.  

- Photorealistic human, real person (not stylized, anime, or cartoon)  
- Neutral standing pose: arms relaxed, feet parallel, natural posture  
- Consistent clothing across all views, realistic fabric behavior  
- Studio lighting: soft, neutral, no dramatic shadows  
- Plain background: light gray or white, clean model sheet style  
- High anatomical accuracy, natural human imperfections  
- Back view must reflect accurate body structure (shoulders, spine, hips, legs)  
- Camera at eye-level, no perspective distortion  
- Ultra high detail, sharp focus, professional photography look

### NEGATIVE PROMPT
`anime, cartoon, illustration, stylized, doll, plastic skin, CGI look, 3D render, low realism, face changed, different person, altered proportions, exaggerated curves, sexualized pose, distorted anatomy, extra limbs, asymmetry errors, blurred face, different outfit, dramatic lighting`

### PARAMETERS
- CFG / Guidance: **7–9** (medium-high)  
- Seed: **fixed** (for consistency)  
- Image weight / reference strength: **high**  
- Aspect ratio: **vertical or grid** (e.g., 2:3 or 4:5)  
- Steps: **medium-high**

### OUTPUT
- final image
- gen_id

## 2. Realistic Facial Expression Sheet from Reference Photo

### INPUT
Ask for the reference image(s) and the corresponding gen_id

### POSITIVE PROMPT
Create a high-resolution facial expression sheet of the **same real person** from the uploaded reference photo.  
Present multiple facial expressions of the **exact same identity**, maintaining perfect facial consistency.

Must remain unmistakably the same individual:  
- Same face shape, eyes, nose, lips, jaw  
- Same skin texture, asymmetries, age, ethnicity  

- Photorealistic human portrait, not stylized  
- Grid layout (character design sheet style)  
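- Fixed camera: straight-on frontal view for every expression (the single left side view listed under CORE EXPRESSIONS is the only exception)  
- Head position locked: no rotation, tilt, or perspective change between frontal shots  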
- Only facial muscles change between expressions  
- Neck and shoulders visible but neutral and consistent  

### CORE EXPRESSIONS
1. Neutral / resting face (front view)  
2. Neutral / resting face (left side view)  
3. Soft smile (closed mouth)  
4. Natural happy smile (slightly open mouth)  
5. Surprise (pronounced but still realistic)  
6. Crying  
7. Playful / subtle joy  

All expressions must be **realistic, restrained, human**—no cartoon exaggeration.

### TECHNICAL & VISUAL CONSTRAINTS
- Studio portrait lighting: soft, even, no dramatic shadows  
- Neutral background (light gray or off-white)  
- Same lighting, focal length, and camera distance for all  
- Sharp focus on facial details: pores, micro-wrinkles, skin texture  
- Natural imperfections preserved  
- No makeup changes or hair movement  
- Hair, eyebrows, eyelashes must remain perfectly consistent  
- Keep white borders on the sides  

### IDENTITY PRESERVATION (CRITICAL)
- No face invention  
- No face drift between expressions  
- No beautification, face swapping, or idealization  
- This is a reference sheet for character consistency and identity control

### NEGATIVE PROMPT (MANDATORY)
`anime, manga, cartoon, illustration, stylized, 3D render, CGI, doll, plastic skin, beauty filter, face swap, different person, facial drift, exaggerated expression, distorted mouth, asymmetrical eyes, deformed jaw, blurry face, soft focus, dramatic lighting, emotional overacting`

### PARAMETERS
- Reference image strength: **high**  
- CFG / Guidance: **7–9**  
- Seed: **fixed**  
- Steps: **medium-high**  
- Aspect ratio: **square or horizontal grid**  
- Face restoration: **OFF or very low**

## 3. Identity-Locked Pose & Scene Replication from References

### INPUT
Ask for:
- Identity reference image(s) and the corresponding gen_id
- Pose/scene reference image

### PROMPT
Using the uploaded **identity reference images** and the **pose/scene reference image**, recreate the scene with the **exact same person identity** placed into the **exact same pose, framing, and environment**.

### IDENTITY REQUIREMENTS
- Same face, body proportions, height, anatomy, skin tone, facial features, asymmetries, age

### POSE & BODY CONSTRAINTS (CRITICAL)
- Accurately replicate: body posture, limb angles, weight distribution, head orientation, hand positions  
- No reinterpretation, stylization, or improvisation  
- Body must align precisely with pose reference skeleton  
- Muscle tension, balance, and gravity must look physically correct and human

### ENVIRONMENT & OBJECTS (LOCKED)
- Background must be **faithfully preserved**  
- All visible objects: present and correctly positioned  
- Props, furniture, architecture, mirrors, vehicles, decor: **not removed, added, or altered**  
- Scale relationships, depth, perspective, spatial layout: **must match**

### LIGHTING, SHADOWS & PHYSICAL INTEGRATION
- Lighting adapted to new subject but **consistent with original scene**  
- Direction, softness, intensity, color temperature: **must match pose reference**  
- Shadows: fall naturally on person and environment  
- Contact shadows: anchor feet/body to ground  
- Reflections, occlusions, light bounce: **physically coherent**  
- No flat lighting or mismatched highlights

### IDENTITY PRESERVATION (HARD RULE)
- No new face or facial structure alteration  
- No face drift, beautification, idealization, or identity blending  
- Facial expression: neutral or matching pose reference (no exaggeration)

### REALISM & STYLE CONSTRAINTS
- Photorealistic human, real person  
- No stylization, anime, illustration, or CGI look  
- Natural skin texture: pores, realistic imperfections  
- Clothing: consistent fabric folds, tension, gravity

### NEGATIVE PROMPT
`anime, cartoon, illustration, stylized, cinematic exaggeration, CGI, 3D render, doll, plastic skin, face swap, different person, altered proportions, pose mismatch, missing objects, added objects, incorrect shadows, floating feet, bad perspective, lighting mismatch, blurry face, artistic reinterpretation`

### PARAMETERS
- Identity reference weight: **high**  
- Pose reference weight: **high**  
- CFG / Guidance: **7–9**  
- Seed: **fixed**  
- Steps: **medium-high**  
- Depth / ControlNet (if available): **ON**  
- Face restoration: **OFF**

Using loaded reference images: black and white emotional portrait of a mid-teenage girl with a serene expression, standing in a sunlit rural field at golden hour, wild grass swaying in the breeze, worn wooden fence in the background, strong natural backlight creating deep shadows and glowing rim light around her hair, soft focus on background, authentic unposed look, eyes slightly squinting in the light, subtle smile of inner peace, high contrast film grain texture, analog imperfections, vignette framing.

It will be your privilege, and at the same time your burden, to choose the subject and let it truly “breathe” in the image.

If you liked the article, let’s stay in touch on LinkedIn at https://www.linkedin.com/in/andreatonin/

Andrea Tonin

A nerd by passion and by profession for over 30 years, I work in technology innovation as a CTO and consultant, designing complex, scalable software ecosystems. Alongside that, I teach IT, sharing experience and good practices gained in the field.
Find out more about my consulting work at lucedigitale.com. You can also find me on LinkedIn.