ArchiJury

Interpretive model to evaluate architectural design competition entries

Project

ArchiJury is a research project that investigates how vision-language models (VLMs) can be trained to interpret, critique, and score architectural designs by drawing on expert knowledge and domain-specific evaluative logic. Rather than emulating the stylistic fluency of generic AI models, ArchiJury aims to approximate the reflective reasoning of human architectural juries—producing structured critique that is both qualitative and quantitative, grounded in architectural theory and context.

Rethinking Critique through AI

In architecture, critique is rarely objective. It is a situated, layered, and interpretive process that blends disciplinary vocabulary with intuition, experience, and cultural literacy. However, real-world critique—whether in juried studios or design competitions—often suffers from inconsistency, limited feedback time, and subjective bias. ArchiJury addresses this tension by asking:

"Can an AI model offer coherent, context-aware architectural evaluations that mirror expert reasoning?"

To do so, the project employs a domain-specific training methodology that combines image-caption pairs with architectural evaluation criteria. While large language models often hallucinate due to lack of domain specificity, ArchiJury minimizes this risk by building upon a curated dataset composed of architectural images and expert-authored captions.

Technical Foundations

The model architecture builds on SmolVLM, a vision-language model optimized for low-latency inference, fine-tuned on 12,320 human-authored captions derived from expert jury prompts. Decoding hyperparameters (top-k sampling, top-p nucleus sampling, and temperature) were tuned to balance diversity and precision in the generated critiques.
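The interplay of these decoding hyperparameters can be illustrated with a minimal, framework-free sketch. This is not the project's actual inference code, only a pure-Python illustration of how temperature, top-k, and top-p jointly shape which token is sampled next:

```python
import math
import random

def sample_next_token(logits, temperature=0.7, top_k=50, top_p=0.9, rng=None):
    """Pick one token id from raw logits using temperature, top-k, then top-p."""
    rng = rng or random.Random()
    # Temperature: lower values sharpen the distribution (more precise),
    # higher values flatten it (more diverse).
    scaled = [l / temperature for l in logits]
    # Softmax over the scaled logits.
    m = max(scaled)
    exp = [math.exp(s - m) for s in scaled]
    total = sum(exp)
    probs = sorted(((e / total, i) for i, e in enumerate(exp)), reverse=True)
    # Top-k: keep only the k most probable tokens.
    probs = probs[:top_k]
    # Top-p (nucleus): keep the smallest prefix whose cumulative mass reaches top_p.
    kept, mass = [], 0.0
    for p, i in probs:
        kept.append((p, i))
        mass += p
        if mass >= top_p:
            break
    # Renormalise and sample from the surviving nucleus.
    norm = sum(p for p, _ in kept)
    r, acc = rng.random() * norm, 0.0
    for p, i in kept:
        acc += p
        if r <= acc:
            return i
    return kept[-1][1]

# With near-zero temperature the sampler behaves almost greedily.
logits = [2.0, 1.0, 0.5, -1.0]
print(sample_next_token(logits, temperature=0.05, top_k=4, top_p=1.0,
                        rng=random.Random(0)))  # → 0
```

Raising the temperature or widening the nucleus trades precision for variety, which is exactly the balance the critique generator needs to avoid repetitive boilerplate without drifting into hallucination.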

Dataset curation involved three stages:

  1. Visual scraping from verified architectural project platforms

  2. Manual filtering to remove infographics and non-perspectival drawings

  3. Expert annotation using a rubric of 10 critique dimensions
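The three curation stages can be sketched as a small pipeline. The record fields and function names below are illustrative assumptions, not the project's actual codebase:

```python
from dataclasses import dataclass, field

# Hypothetical record for one scraped image; field names are illustrative.
@dataclass
class Entry:
    url: str
    drawing_type: str          # e.g. "perspective", "infographic", "plan"
    critiques: list = field(default_factory=list)

RUBRIC_DIMENSIONS = 10  # each kept image receives ten expert critiques

def curate(scraped):
    """Stage 2: drop infographics and non-perspectival drawings."""
    return [e for e in scraped if e.drawing_type == "perspective"]

def annotate(entry, expert_texts):
    """Stage 3: attach expert critiques, enforcing the rubric size."""
    if len(expert_texts) != RUBRIC_DIMENSIONS:
        raise ValueError("each image needs exactly 10 critique dimensions")
    entry.critiques = list(expert_texts)
    return entry

scraped = [
    Entry("https://example.org/a.jpg", "perspective"),
    Entry("https://example.org/b.png", "infographic"),
]
kept = curate(scraped)
print(len(kept))  # → 1
```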

Model Evolution: From Classification to Contextual Critique

The project progressed through three major development phases—each refining both the model’s task structure and the depth of architectural reasoning it could emulate.

In version 1 (v1), the model was trained on a binary-labeled dataset consisting of 839 organic and 750 synthetic images. These labels focused on visual and formal features such as geometry type, color strategy, and composition logic. This phase was primarily aimed at feature recognition and tagging. The images, some of which were generated using prompts like “angular pavilion with polychromatic facade”, served to expand the dataset's scope and introduce controlled variety in form.


However, version 2 (v2) marked a fundamental shift. Here, the model was tasked with producing comprehensive architectural critiques rather than mere classifications. A new dataset of 1,232 real architectural images was compiled through rigorous scraping and filtering.

Each image was annotated with 10 textual critiques written in the style of architectural juries, based on prompts such as:

“As an architect, write a paragraph evaluating this building’s form, context, construction strategy, and architectural principles.”
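Each (image, prompt, critique) triple becomes one supervised fine-tuning example. A minimal sketch of how such pairs might be serialised for training follows; the field names and file paths are assumptions, not the project's actual schema:

```python
import json

PROMPT = ("As an architect, write a paragraph evaluating this building's "
          "form, context, construction strategy, and architectural principles.")

def make_example(image_path, critique):
    # One supervised fine-tuning record: image, instruction, target text.
    return {"image": image_path, "prompt": PROMPT, "target": critique}

def to_jsonl(examples):
    # JSON Lines: one training record per line, a common fine-tuning format.
    return "\n".join(json.dumps(e) for e in examples)

rows = [make_example("images/entry_001.jpg", "The undulating roofline ...")]
line = to_jsonl(rows)
print(json.loads(line)["image"])  # → images/entry_001.jpg
```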


These critiques were structured according to evaluative dimensions such as scale, form, style, structural logic, environmental sensitivity, and contextual integration—mirroring the dimensions found in design jury rubrics.


The sample output below shows a generated critique, demonstrating how the model interprets urban context, facade articulation, spatial organization, and material usage.


This striking public building combines modern transparency with an organic design ethos, evident in its undulating roofline that mimics natural forms while framing the expansive glass façade below. The structure's scale and open interior suggest a civic or institutional function, likely a library or cultural center, where light and openness foster community interaction and knowledge sharing. The rhythmic arrangement of slender columns enhances verticality while maintaining an elegant balance and unity with the horizontal roof profile. The use of glass maximizes natural light and visual connection between the interior and exterior, emphasizing transparency and accessibility. The landscaped surroundings integrate seamlessly with the architecture, softening the formal grid of the plaza and creating a welcoming approach. The design’s sustainable intent is evident in the thoughtful interaction with natural light and potential energy efficiency measures. Overall, this project masterfully blends form, function, and environmental sensitivity into a contemporary architectural landmark.

These outputs were validated and iteratively improved by domain experts to ensure architectural coherence.


Scoring Architecture: Toward Measurable Critique

In version 2.1, ArchiJury introduces quantitative scoring, translating qualitative observations into weighted scores across six architectural criteria: form and originality, context and urban relation, structural strategy, density and scale, sustainability, and compositional principles.


The scoring process employs a combination of sentence-transformer models and weight-tuned scoring functions. Each architectural image receives a full critique paragraph along with a normalized score (e.g., 71.57 / 100). This dual-format output allows ArchiJury to serve both formative and summative roles: feedback for learning, and grading for evaluation.
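A minimal sketch of such a weighted aggregation follows, using a plain cosine-similarity function in place of an actual sentence-transformer encoder. The per-criterion weights are hypothetical placeholders, not the values reported in the paper:

```python
import math

# Hypothetical per-criterion weights (must sum to 1.0); the project's
# actual tuned weights are not reproduced here.
WEIGHTS = {
    "form_and_originality": 0.20,
    "context_and_urban_relation": 0.20,
    "structural_strategy": 0.15,
    "density_and_scale": 0.15,
    "sustainability": 0.15,
    "compositional_principles": 0.15,
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def weighted_score(criterion_embeddings, reference_embeddings):
    """Map per-criterion embedding similarities to a normalized 0-100 score."""
    total = 0.0
    for name, w in WEIGHTS.items():
        sim = cosine(criterion_embeddings[name], reference_embeddings[name])
        total += w * (sim + 1.0) / 2.0  # rescale cosine from [-1, 1] to [0, 1]
    return round(100.0 * total, 2)

# A critique whose embeddings match the reference perfectly scores 100.
emb = {k: [1.0, 0.0, 1.0] for k in WEIGHTS}
print(weighted_score(emb, emb))  # → 100.0
```

In practice the per-criterion embeddings would come from a sentence-transformer model encoding the critique text and the rubric's reference descriptions; only the aggregation step is shown here.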


To ensure interpretability, Grad-CAM visualizations and heatmaps were explored as potential scoring explainability tools—linking sections of the architectural image to critique elements. While still experimental, this step moves toward a future where AI-based scoring is not only accurate, but also justifiable and visually legible.
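The Grad-CAM idea itself is simple: weight each convolutional activation map by its spatially averaged gradient, sum, and keep only the positive evidence. A framework-free numeric sketch, assuming tiny nested-list activation and gradient maps in place of real network tensors:

```python
def grad_cam(activations, gradients):
    """activations, gradients: lists of K HxW maps (nested lists).
    Returns the ReLU of the gradient-weighted sum of the activation maps."""
    h, w = len(activations[0]), len(activations[0][0])
    heatmap = [[0.0] * w for _ in range(h)]
    for a_map, g_map in zip(activations, gradients):
        # alpha_k: global-average-pooled gradient for channel k
        alpha = sum(sum(row) for row in g_map) / (h * w)
        for i in range(h):
            for j in range(w):
                heatmap[i][j] += alpha * a_map[i][j]
    # ReLU: keep only regions that positively support the prediction
    return [[max(0.0, v) for v in row] for row in heatmap]

acts = [[[1.0, 0.0], [0.0, 2.0]]]   # one 2x2 activation channel
grads = [[[0.5, 0.5], [0.5, 0.5]]]  # uniform gradient -> alpha = 0.5
print(grad_cam(acts, grads))  # → [[0.5, 0.0], [0.0, 1.0]]
```

Upsampled and overlaid on the input image, such a heatmap links critique statements about, say, facade articulation to the image regions that drove them.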


Conclusion and Vision

Rather than replacing human juries, ArchiJury envisions AI as an interpretive collaborator—a tool that can scale up thoughtful critique, assist novice designers, and bring consistency to the architectural evaluation process. It opens new possibilities for integrating machine reasoning into the culture of design education and review, especially in data-rich, feedback-poor contexts.


📘 Publication

Çiçek, S., Aksu, M. S., Öztürk, E., Bingöl, K., Mersin, G., Koç, M., Akmaz, O. K., & Başarır, L. (2025). ArchiJury: Exploring the Capabilities of Vision-Language Models to Generate Architectural Critique. Journal of Computational Design, 6(1), 165–190.
https://doi.org/10.53710/jcode.1618548


Get in touch to transform artificial intelligence methods and techniques into real-life applications.

United Methods of Artificial Intelligence Lab
