This is not just about the content we publish; it’s also about how we create it.
Here we share our approach, the guiding questions behind our work, and the insights we’ve gained through experimenting with AI-generated articles.
MAIA methodology for creating articles
The MAIA project aims to resolve one major issue in science popularisation: the large amount of work required to process the underlying knowledge, summarise it, and formulate the main messages in a form and language suitable for the intended audience. With this in mind, we developed a set of "MAIA knowledge tools" facilitating the discovery, ingestion, and GenAI analysis of documents, and used them, among other things, to develop a series of articles published on the Climate Change Mitigation Portal in September 2025.
These articles are based on video presentations of different climate change adaptation and mitigation projects previously facilitated by the MAIA team. To generate them, the MAIA SummarAIse service suite (specifically, the SumQA service) was used to extract information from webinar transcripts. In the first step, the project team experimented with different AI models and different ways of defining "prompts" (the questions that GenAI is expected to resolve by analysing the underlying documents). These questions reflected the structured content of the article by asking, for each project or initiative, the following (sketched in code after the list):
- Full name and project acronym
- Key targets of the project and a short introduction
- Keywords that represent the project or initiative
- Key climate risks that the project addresses
- Method of the project, given stepwise if possible
- Key outputs of the project and how these are communicated or presented
- Unique selling proposition of the project or initiative (distinguishing features or innovative elements)
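To illustrate, such a question set can be captured as structured data. The sketch below is in Python; the field labels and exact wording are ours, not the precise prompts refined during the experimentation phase:

```python
# Illustrative question set for extracting project information from a
# transcript; labels and wording are examples, not the exact prompts used.
QUESTIONS = [
    {"field": "name", "prompt": "What is the full name and acronym of the project?"},
    {"field": "introduction", "prompt": "What are the key targets of the project? Give a short introduction."},
    {"field": "keywords", "prompt": "List keywords that represent the project or initiative."},
    {"field": "climate_risks", "prompt": "Which key climate risks does the project address?"},
    {"field": "method", "prompt": "Describe the method of the project, stepwise if possible."},
    {"field": "outputs", "prompt": "What are the key outputs, and how are they communicated or presented?"},
    {"field": "usp", "prompt": "What is the unique selling proposition of the project?"},
]
```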
Following the experimentation phase, in which different AI models, different formulations of the prompts (questions), and different formulations of the system prompt (instructions to the AI core concerning how it should resolve these questions) were tested, the service was instructed to batch-process all transcripts, resulting in a set of question/answer pairs summarising the contents of all videos.
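Conceptually, this batch step is a loop over transcripts and questions. The sketch below reuses the QUESTIONS list from above and assumes a hypothetical ask() helper standing in for the SumQA service call, whose real interface is not documented here:

```python
from pathlib import Path

def ask(model: str, system_prompt: str, document: str, question: str) -> str:
    """Hypothetical stand-in for a SumQA service call; the real
    interface may differ."""
    raise NotImplementedError

def batch_process(transcripts_dir: str, model: str, system_prompt: str) -> list[dict]:
    """Run every question against every transcript, collecting one
    question/answer set per video (QUESTIONS as sketched above)."""
    results = []
    for path in sorted(Path(transcripts_dir).glob("*.txt")):
        document = path.read_text(encoding="utf-8")
        answers = {q["field"]: ask(model, system_prompt, document, q["prompt"])
                   for q in QUESTIONS}
        results.append({"transcript": path.name, "answers": answers})
    return results
```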
In the final step, these answer sets were re-uploaded to the SumQA platform and used to generate complete articles. At this stage, several outputs were generated by different AI models, and human experts drew on these outputs to produce the final version of each article. Overall, the procedure used AI models to automate answers and article drafts for efficient delivery, while manual reviews ensured the reliability of the articles.
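This final step can be pictured as one draft per model followed by human selection; again a sketch with hypothetical names, since the actual article-generation call is not documented here:

```python
def draft_article(model: str, answer_set: dict, instructions: str) -> str:
    """Hypothetical call asking one AI model to turn an answer set
    into a complete article draft."""
    raise NotImplementedError

MODELS = ["model-a", "model-b", "model-c"]  # placeholder model names

def generate_drafts(answer_set: dict, instructions: str) -> dict[str, str]:
    """Produce one draft per model; a human editor then compares the
    drafts and assembles the published article from them."""
    return {model: draft_article(model, answer_set, instructions)
            for model in MODELS}
```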
Explore the tool here: Public Test Instance
Amplify your reach: MAIA Multiply
Lessons Learnt from Using AI Tools for Data Extraction and Summarization
Advantages:
Efficiency of automated results: One of the most notable benefits of SumQA is that, by automating the extraction of information and delivering answers all at once, it speeds up research and data management, allowing quick insights without manually sifting through large volumes of data.
Comparative transparency: Another strength lies in the transparency gained by comparing outputs from multiple AI models. This approach helps address ethical concerns such as AI bias and censorship by enabling users to evaluate and contrast responses rather than relying on a single model. Consequently, it promotes a more controlled and trustworthy use of AI-generated content.
Tailored inputs: Furthermore, the flexibility in configuring system prompts, such as specifying narrative voice, output format, section structure, tone, and keyword categorization, helps users tailor outputs, ensuring that the outcomes are relevant, whether for experts or general audiences.
Together, these three factors (efficiency of automated results, comparative transparency, and tailored inputs) make the tool invaluable for summarization tasks when detailed and non-conflicting instructions are provided. They enable curators to review summarized outputs and remain aware of potential biases without having to review the original documents in detail, thereby supporting their productivity.
Challenges:
However, like other AI tools, SumQA faced challenges, including AI hallucination, where models generate inaccurate or fabricated information. The tool currently lacks mechanisms to quantify the extent of hallucination in its responses, making it difficult to assess the reliability of the answers provided.
Inconsistencies in formatting are common, such as the failure to exclude reference numbers when requested, or the irregular inclusion of subheadings for specified sections like challenges, methods, and expected outcomes. Additionally, some models, such as EuroLLM, performed poorly, often ignoring instructions and referring to incorrect source material.
The tool also risks producing overlapping article sections, especially when detailed descriptions of expected outcomes or unique selling points are requested. A major challenge is inconsistent adherence to instructions regarding length, the inclusion of key terms and specific details, and the avoidance of vague language. This inconsistency ultimately impacts the clarity and usefulness of the generated reports.
More specifically, in contrast to other AI models, EuroLLM and Gemma 2 have greater difficulty following instructions as those instructions become more detailed, producing irrelevant or incomplete answers more frequently.
Recommendations:
To overcome ignored instructions, we encourage stricter, that is, more detailed, system prompts and question designs: specify the format, style, and section headings, and place a clear emphasis on relying exclusively on the input sources to minimize hallucination. Moreover, when asking for a specific subsection, it helps to follow the request with a short description and then an example.
From experimentation, we found that machine-readable formats like JSON for system prompts and questions can facilitate instruction adherence (see the sketch below). In addition to JSON, prompt engineering with the best-responding AI models (such as GPT and Llama) is suggested, as this combination delivered the most reliable and expected outputs. Most importantly, introducing quality control processes, whether expert human review or AI-based supervision, is needed to ensure outputs meet standards before publishing.
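As an illustration of the machine-readable approach, a system prompt and a question could be expressed as JSON. The structure below is a sketch built in Python; the field names are our own and do not reflect a documented SumQA schema:

```python
import json

# Illustrative machine-readable system prompt; field names are examples,
# not a documented SumQA schema.
system_prompt = {
    "role": "You summarise climate research projects for a general audience.",
    "rules": [
        "Rely exclusively on the provided source documents.",
        "If the sources do not contain the answer, say so instead of guessing.",
        "Do not include reference numbers in the output.",
    ],
    "output": {"format": "plain text", "max_words": 150},
}

# A question following the recommended pattern: prompt, short description,
# then an example.
question = {
    "prompt": "What are the key outputs of the project?",
    "description": "Concrete deliverables such as reports, tools, or datasets.",
    "example": "An open-access flood-risk map covering coastal regions.",
}

print(json.dumps({"system_prompt": system_prompt, "question": question}, indent=2))
```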
Other, more technical recommendations: fine-tuning AI models or supplementing them with additional tools is encouraged to boost performance and consistency, and investing in hardware upgrades, such as more powerful graphics cards, will enable faster training, larger batch processing, and the handling of more complex data sets. The aim is to develop AI systems that can manage hallucinations, data completeness, and contextually relevant outputs, which is particularly critical for the targeted focus areas of climate change mitigation and adaptation.
Overall, SumQA is a useful tool that allows for efficient, transparent, and expert-driven extraction and summarization of projects. However, to fully realize its potential, key challenges must be addressed through prompt engineering, quality control, and model optimization. These improvements can ultimately facilitate more effective and accurate AI-assisted research and knowledge generation.
MAIA
MAIA creates, connects, and supports communities, services, and tools to turn EU-funded climate research into actionable insights and commercially viable products, services, and IP. When you join the MAIA community, you get access to an interconnected suite of tools and services.
Project details
- Project title: “Maximising impact and accessibility of European climate research” (MAIA)
- Funding scheme: European Union Horizon Europe Programme (grant agreement no. 101056935)
- Duration: 3 years (1 September 2022 – 31 August 2025)
- Project coordinator: BC3 Basque Centre for Climate Change, Spain
- Project website: https://maia-project.eu
Beyond MAIA
Enabling knowledge curators to develop a knowledge-extraction procedure that can later be applied to a large number of input documents is key to increasing their productivity and to sustaining the knowledge platforms that present the results of the latest research and innovation to different stakeholders. MAIA SummarAIse was developed specifically for this purpose, and its development will continue beyond the end of the MAIA project, facilitated by the NEB Junction project. Likewise, the "Knowledge community" initiated by MAIA will be maintained by AIT independently of EU projects. Stakeholders interested in climate change knowledge production and knowledge services are kindly invited to join this community by indicating their interest in the "Network of Climate Adaptation and Mitigation Knowledge Platforms" through the "MAIA Multiply" sign-up form.
MAIA project team
This article was created by the MAIA project team using the MAIA Knowledge Toolkit, most notably the SumQA service. Below is a selection of sources and tools that supported the creation of this article, including the original data it is based on, relevant project links, and the AI tool used in the writing process:
- MAIA project website
- MAIA Webinar Library
- MAIA Knowledge Toolkit
- Behind Our Posts: Method, Questions, and Insights