Senior Project Advisor

Brian Hutchinson

Document Type

Project

Publication Date

Spring 2023

Keywords

Large Language Models, LLM, LLMs, Economics, ML Evaluation, ML, AI

Abstract

This paper describes a novel dataset, EconQA, constructed to assess the performance of large language models on multiple-choice economics questions. I present results from 10 experiments that vary prompts and model choices. The results challenge previous findings that prompt choice has a large impact on response quality. Using the GPT-3.5 Turbo model, observed performance ranged from 70% to 77% across all prompt choices, with the no-prompt baseline scoring 73%. When prompted to use Chain-of-Thought reasoning with examples, performance was highest at 76%. Contrary to previous research, performance on mathematical questions was high under Chain-of-Thought prompting. The paper closes with an analysis of the types of questions the models performed best on and of common errors.

Department

Computer Science

Recommended Citation

Van Patten, Tate, "Evaluating Domain Specific LLM Performance Within Economics Using the Novel EconQA Dataset" (2023). WWU Honors College Senior Projects. 657.
https://cedar.wwu.edu/wwu_honors/657

Subjects - Topical (LCSH)

Natural language processing (Computer science); Artificial intelligence; Text data mining; Economics

Type

Text

Rights

Copying of this document in whole or in part is allowable only for scholarly purposes. It is understood, however, that any copying or publication of this document for commercial purposes, or for financial gain, shall not be allowed without the author’s written permission.

Language

English

Format

application/pdf

Included in

Computer Sciences Commons

Western CEDAR

WWU Honors College Senior Projects

Evaluating Domain Specific LLM Performance Within Economics Using the Novel EconQA Dataset
