GeoAI Insights Dashboard
Welcome to the GeoAI Insights Dashboard. This platform, developed as a GIS program capstone project at COGS NSCC, offers a suite of tools to evaluate and compare the performance of various Large Language Models (LLMs). The focus is on applying LLMs to geospatial analysis and on their ability to help users perform complex database queries using natural language. Explore detailed metrics, visualizations, and rankings to understand each model's strengths, weaknesses, and overall suitability for specific GeoAI tasks. Navigate through the sections dedicated to core performance, specialized SQL evaluations, and comparative model rankings to gain actionable insights.
Section 1: Core Performance Metrics
Structured Response Reliability
Evaluate each model's ability to follow instructions and return a well-formed JSON object as requested. Success in this context means the model produced valid, parseable JSON, indicating reliability in structured data output.
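As an illustration, a check like this can be as simple as attempting to parse the response and confirming the expected keys are present. The following is a minimal Python sketch; the function name and the required "layers" key are assumptions for illustration, not the dashboard's actual pipeline.

    import json

    def is_valid_json_response(raw_response, required_keys=("layers",)):
        """Return True if raw_response parses to a JSON object
        containing every key in required_keys (hypothetical keys)."""
        try:
            parsed = json.loads(raw_response)
        except json.JSONDecodeError:
            return False
        # Require a JSON object, not a bare list, number, or string.
        if not isinstance(parsed, dict):
            return False
        return all(key in parsed for key in required_keys)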
Geospatial Layer Picking Accuracy
Analyze the accuracy of models in selecting the correct geospatial layers for given tasks. This summary shows correctness rates, including how often models chose the right layers, missed required layers, or selected extra, unnecessary layers.
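Conceptually, these rates fall out of a set comparison between the layers a model selected and the layers the task requires. A minimal sketch, assuming layer names are compared as plain strings:

    def compare_layer_selection(selected, expected):
        """Classify a model's layer choices against the expected set,
        returning the correct, missing, and extra layers."""
        selected_set, expected_set = set(selected), set(expected)
        return {
            "correct": selected_set & expected_set,   # right layers chosen
            "missing": expected_set - selected_set,   # required layers skipped
            "extra": selected_set - expected_set,     # unnecessary layers added
        }

    # Example: the model picked one wrong layer and missed a required one.
    result = compare_layer_selection(["roads", "buildings"], ["roads", "hydrology"])
    # result == {"correct": {"roads"}, "missing": {"hydrology"}, "extra": {"buildings"}}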
Combined Performance Overview
Get a combined overview of model performance that integrates the structured-response return status with layer-selection correctness, providing a holistic view of model capability across both evaluation criteria.
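One plausible way to combine the two criteria, sketched below by reusing the two hypothetical helpers from the previous examples, is to count a response as fully successful only when it is well-formed JSON and selects exactly the expected layers.

    import json

    def overall_success(raw_response, expected_layers):
        """Combine structured-response reliability and layer-selection
        correctness into one pass/fail outcome (illustrative only)."""
        if not is_valid_json_response(raw_response):
            return False
        selected = json.loads(raw_response).get("layers", [])
        outcome = compare_layer_selection(selected, expected_layers)
        # Fully successful only when no required layer is missing
        # and no unnecessary layer was added.
        return not outcome["missing"] and not outcome["extra"]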
Section 2: Specialized Evaluations
Dive into the SQL-focused evaluations, which measure how well models translate natural-language questions into executable database queries, covering both the correctness and the efficiency of the generated SQL.
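A common way to judge generated SQL, and a reasonable guess at what a correctness check like this involves, is execution matching: run the generated query and a reference query against the same database and compare their result sets. A minimal sketch, assuming an SQLite backend and a hypothetical function name:

    import sqlite3
    from collections import Counter

    def sql_results_match(db_path, generated_sql, reference_sql):
        """Execute both queries against the same database and compare
        their result multisets, ignoring row order (illustrative only)."""
        with sqlite3.connect(db_path) as conn:
            try:
                generated_rows = conn.execute(generated_sql).fetchall()
            except sqlite3.Error:
                return False  # generated SQL failed to parse or execute
            reference_rows = conn.execute(reference_sql).fetchall()
        return Counter(generated_rows) == Counter(reference_rows)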
Section 3: Model Rankings
Core Performance Rankings
See how models rank based on overall success rates or on specific correctness metrics for general tasks; this view helps identify the top-performing models at a glance.
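As a sketch of how such a ranking could be computed from per-task results (the DataFrame columns here are assumptions, not the dashboard's actual schema):

    import pandas as pd

    # Hypothetical per-task results: one row per (model, task) evaluation.
    results = pd.DataFrame({
        "model": ["A", "A", "B", "B", "C", "C"],
        "success": [True, True, True, False, False, False],
    })

    # Rank models by overall success rate, highest first.
    ranking = (
        results.groupby("model")["success"]
        .mean()
        .sort_values(ascending=False)
        .rename("success_rate")
    )
    print(ranking)  # A: 1.0, B: 0.5, C: 0.0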
Specialized Evaluations Rankings
Explore model rankings based on performance in SQL generation and evaluation tasks, focusing specifically on SQL-related correctness and efficiency.