Large Language Models (LLMs) have disrupted many aspects of everyday life. They assist in generating useful information and excel at producing large amounts of human-like text in the blink of an eye. It is therefore no surprise that many people take LLM outputs at face value, impressed by long passages of generated text that look reasonable.
At their core, however, these tools are nothing more than next-token predictors. The generated output consists of the most probable units learned from large training datasets. The most valuable information, on the other hand, is often encapsulated in output that is rational yet improbable. LLMs are good at reproducing existing patterns but often fall short when pushed to the limits of the known, especially in specialized or complex tasks.
Retrieval-Augmented Generation, often referred to as RAG, partially solves this problem. It adds task-specific information to the LLM's context, which leads to considerably higher accuracy when reasoning about a specific topic.
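The core RAG idea can be sketched in a few lines: rank documents in a datastore by relevance to the question, then prepend the best matches to the prompt. The toy datastore, the word-overlap scoring, and the helper names below are purely illustrative (real systems use embedding-based similarity), not part of any actual RAG library.

```python
# Minimal RAG sketch: retrieve the most relevant snippet from a small
# datastore by word overlap, then build an augmented prompt for the LLM.

def word_overlap(query: str, doc: str) -> int:
    """Score a document by how many words it shares with the query."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, datastore: list[str], k: int = 1) -> list[str]:
    """Return the top-k documents ranked by overlap with the query."""
    return sorted(datastore, key=lambda d: word_overlap(query, d), reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Prepend the retrieved context to the user's question."""
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"

datastore = [
    "Zoning records show industrial land use along the river corridor.",
    "The regional rail line was extended to the northern suburbs in 2019.",
]
question = "What is the land use along the river?"
prompt = build_prompt(question, retrieve(question, datastore))
```

The augmented prompt now contains the zoning snippet, so the LLM can ground its answer in it rather than guess.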
What if the input data is difficult to reason about, such as an image or highly structured data like a graph? What if we need to reason about the future dynamics of this data or perform calculations on it? The challenge with using general-purpose Large Language Models (LLMs) for such decision-making is that they are too generic and fail to account for the specific attributes of the data, even when it is supplied in the LLM's context. In geospatial analysis, for example, slight changes in land use evolution over time cannot be predicted using general-purpose LLMs alone. Therefore, as part of the POLIRURAL+ project, the GeoAI Advancements aim to integrate the precision of spatial predictors with the versatility of LLMs.
In the system we are developing, users can select specific parts of a map by pinpointing coordinates of interest. Upon the user's query and selection, an API call is automatically triggered to retrieve the relevant geospatial data for that area. The data is then processed by the spatial predictor, a specialized model trained on historical geospatial trends to forecast precise future developments, such as urban growth, land use changes, or environmental shifts.
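The selection, retrieval, prediction flow just described could look roughly as follows. Both `fetch_geodata` and `SpatialPredictor` are stand-in stubs invented for this sketch, not actual POLIRURAL+ components; the real predictor is a trained model, not a linear extrapolation.

```python
# Sketch of the flow: map selection -> API call -> spatial predictor.

def fetch_geodata(lat: float, lon: float) -> dict:
    """Stub for the API call triggered by the user's map selection."""
    return {"lat": lat, "lon": lon,
            "built_up_pct": [5, 7, 9, 12]}   # historical built-up share

class SpatialPredictor:
    """Stub predictor: extrapolates the built-up trend one step ahead."""
    def forecast(self, data: dict) -> dict:
        series = data["built_up_pct"]
        step = series[-1] - series[-2]        # last observed change
        return {"next_built_up_pct": series[-1] + step}

data = fetch_geodata(49.74, 13.37)            # coordinates picked on the map
prediction = SpatialPredictor().forecast(data)
```

The predictor's structured output (here a single forecast value) is what later gets handed to the LLM as context.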
To trigger a specific API, we propose an agentic approach in which the LLM is designed to act as an autonomous conversational agent, capable of managing tasks such as answering questions, performing actions (e.g., booking appointments), and learning from interactions with users to refine its responses or strategies over time. A model following the agentic approach is therefore not only reactive to input but also proactive, goal-driven, and capable of learning and adapting through its interactions with users or environments. This enhances the autonomy and usefulness of AI systems by allowing them to function more like intelligent agents with decision-making capacities.
Compared to a basic RAG system, where context information is retrieved from a datastore and then included with the question, an agent can obtain even more information by using external tools, so it is not constrained solely to predicting and generating text. For example, tools can perform mathematical calculations, access data beyond the datastore through web searches, further process retrieved data by passing it to another model, such as the predictors mentioned earlier, or call task-specific APIs. The actions an agent takes are not predetermined: the agent decides whether it needs tools, and which ones, based on its understanding of the available tools and the nature of the question. This gives the system greater flexibility in answering more complex or specific tasks.
For instance, if you want the LLM to provide a summary of monthly temperatures for a specific area, and this information is not present in the datastore, a basic RAG model would struggle to provide the correct answer because it wouldn’t find relevant information during its context retrieval. In contrast, an agent can determine that answering the question requires temperature data. Knowing it has access to an API capable of supplying such data, it can make a call to this API, include the obtained temperature data in the LLM's context, and guide the system toward the correct answer.
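The temperature scenario can be sketched as a minimal tool-calling loop. Here `get_monthly_temperatures` is a hypothetical stand-in for a weather API, and `choose_tool` is a toy keyword check standing in for the LLM's actual tool-selection reasoning; none of these names refer to a real API.

```python
# Sketch of the agent behaviour described above: inspect the question,
# decide whether a tool is needed, call it, and fold the result into
# the context handed to the LLM.

def get_monthly_temperatures(area: str) -> dict:
    """Hypothetical weather-API tool; degrees Celsius per month."""
    return {"Jan": -1.2, "Feb": 0.4, "Mar": 4.8}

TOOLS = {"temperature": get_monthly_temperatures}

def choose_tool(question: str):
    """Toy stand-in for the LLM's tool-selection step."""
    return "temperature" if "temperature" in question.lower() else None

def answer(question: str, area: str) -> str:
    tool = choose_tool(question)
    context = ""
    if tool is not None:                       # agent decided a tool is needed
        context = f"Tool data: {TOOLS[tool](area)}\n"
    return context + f"Question: {question}"   # augmented prompt for the LLM

prompt = answer("Summarise monthly temperatures for this area", "Pilsen region")
```

A basic RAG system would return an empty context here; the agent instead supplies the temperature data the answer depends on.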
The agent can be configured to meet specific needs by defining conditions and setting criteria that guide its actions. For example, paths can be defined for the agent to follow after using a tool to ensure correct behavior, or certain tools can be restricted to specific scenarios. The combination of LLMs and external tools enables the agent to produce more accurate and context-rich outputs, particularly for tasks that require up-to-date or domain-specific knowledge.
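Restricting tools to specific scenarios, as described above, can be as simple as a policy table consulted before each tool call. The tool names and scenario labels below are illustrative assumptions, not the project's actual configuration.

```python
# Sketch: each tool lists the scenarios in which the agent may invoke it.
TOOL_POLICY = {
    "web_search": {"general"},
    "spatial_predictor": {"geospatial"},
    "calculator": {"general", "geospatial"},
}

def allowed_tools(scenario: str) -> set:
    """Return the set of tools the agent may use in the given scenario."""
    return {name for name, scopes in TOOL_POLICY.items() if scenario in scopes}

# In a geospatial session, web search is off-limits:
geo_tools = allowed_tools("geospatial")   # spatial_predictor and calculator only
```

Guarding each tool call with such a check is one way to keep the agent's behaviour predictable without hard-coding its full action sequence.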
These spatial predictors, in combination with the agentic approach, offer more specific and accurate insights than general-purpose LLMs. While LLMs are versatile and generate rich, context-based responses, spatial predictors, trained on historical datasets and tailored to specific use cases, focus on niche areas like geographic patterns and predictions. Moreover, such models are not only reliable but also extremely lightweight compared to LLMs. When usage scales up, this saves a large amount of resources, since LLM usage cost is proportional to the number of generated tokens.
Once the spatial predictor processes the geospatial data, its output is used as context for the LLM. This integration therefore not only saves resources while delivering higher precision, but also transforms the technical results into user-friendly explanations, giving the end user a clear understanding of future trends or developments in the selected geographic area.
For instance, land use evolution can depend on various factors, such as the surrounding area, climate, transport connections, and available resources. While metadata from maps used as LLM context can assist with basic area analysis, more complex, non-linear predictions require a specialized tool trained to analyze environmental and infrastructural factors.
In addition to land use, this tool can be applied to analysis and dynamic prediction across various data sources, including weather forecasting, environmental shifts, tourism potential, regional investment opportunities, and more. By forecasting a region's evolution, it enables users to make informed, data-driven decisions, leading to an efficient solution for geospatial planning.