Over the years, significant time and resources have been dedicated to improving data quality in survey research. While the quality of open-ended responses plays a key role in evaluating the validity of each participant, manually reviewing each response is a time-consuming task that has proven challenging to automate.
Although some automated tools can flag problematic content such as gibberish or profanity, the real challenge lies in assessing the overall relevance of an answer. Generative AI, with its contextual understanding and user-friendly nature, gives researchers the opportunity to automate this arduous response-cleaning process.
Harnessing the Power of Generative AI
Generative AI to the rescue! Assessing the contextual relevance of open-ended responses can be automated in Google Sheets by building a custom VERIFY_RESPONSE() formula.
This formula calls the OpenAI Chat Completions API, returning a quality assessment for each open-end along with a reason for rejection where applicable. We can help the model generate more accurate assessments by providing training data that contains examples of good and bad open-ended responses.
As a result, it becomes possible to assess hundreds of open-ended responses within minutes, achieving reasonable accuracy at a minimal cost.
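To make this concrete, here is a minimal sketch of what such a custom function could look like in Google Apps Script. The script property name, the model, and the few-shot examples below are illustrative assumptions rather than an exact implementation; you would adapt them to your own surveys.

/**
 * VERIFY_RESPONSE: sketch of a Google Sheets custom function that asks the
 * OpenAI Chat Completions API whether an open-ended answer is relevant.
 * Assumes an API key is stored as a script property named OPENAI_API_KEY.
 *
 * Example usage in a cell: =VERIFY_RESPONSE(A2, B2)
 *
 * @customfunction
 */
function VERIFY_RESPONSE(question, answer) {
  var apiKey = PropertiesService.getScriptProperties().getProperty('OPENAI_API_KEY');

  // Few-shot "training data": examples of good and bad open-ends that steer
  // the model's judgement, followed by the response to be assessed.
  var messages = [
    { role: 'system', content:
        'You review open-ended survey answers. Reply with "VALID" if the answer ' +
        'is a relevant, good-faith response to the question; otherwise reply ' +
        'with "REJECT: " followed by a short reason.' },
    { role: 'user', content: 'Question: What do you like about this product?\nAnswer: asdf jkl qwerty' },
    { role: 'assistant', content: 'REJECT: gibberish, not a meaningful answer' },
    { role: 'user', content: 'Question: What do you like about this product?\nAnswer: The battery lasts all day and it charges quickly.' },
    { role: 'assistant', content: 'VALID' },
    { role: 'user', content: 'Question: ' + question + '\nAnswer: ' + answer }
  ];

  // Call the Chat Completions endpoint with deterministic settings.
  var response = UrlFetchApp.fetch('https://api.openai.com/v1/chat/completions', {
    method: 'post',
    contentType: 'application/json',
    headers: { Authorization: 'Bearer ' + apiKey },
    payload: JSON.stringify({ model: 'gpt-4o-mini', messages: messages, temperature: 0 }),
    muteHttpExceptions: true
  });

  // Return the verdict (e.g. "VALID" or "REJECT: off-topic") into the cell.
  var data = JSON.parse(response.getContentText());
  return data.choices[0].message.content.trim();
}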
Best Practices for Optimal Results
While generative AI offers impressive capabilities, it ultimately relies on the guidance and training provided by humans. In the end, AI models are only as effective as the prompts we give them and the data on which we train them.
By implementing the following ACTIVE principles (Adaptability, Confidentiality, Tuning, Integration, Validation, and Efficiency), you can develop a tool that reflects your thinking and expertise as a researcher, while entrusting the AI to handle the heavy lifting.
Adaptability
To help maintain effectiveness and accuracy, you should regularly update and retrain the model as new patterns in the data emerge. For example, if a recent world or local event leads people to respond differently, you should add new open-ended responses to the training data to account for these changes.
Confidentiality
To address concerns about data handling once it has been processed by a generative pre-trained transformer (GPT), be sure to use generic open-ended questions designed solely for quality assessment purposes. This minimizes the risk of exposing your client’s confidential or sensitive information.
Tuning
When introducing new audiences, such as different countries or generations, it’s important to carefully monitor the model’s performance; you cannot assume that everyone will respond similarly. By incorporating new open-ended responses into the training data, you can enhance the model’s performance in specific contexts.
Integration with Other Quality Checks
By integrating AI-powered quality assessment with other traditional quality control measures, you can mitigate the risk of erroneously excluding valid participants. It’s always a good idea to disqualify participants based on multiple quality checks rather than relying solely on a single criterion, whether AI-related or not.
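As a rough illustration, the AI verdict can be combined with other flags so that a participant is only removed when several checks agree. The helper below is hypothetical, and the column references and two-flag threshold in the usage example are assumptions.

/**
 * SHOULD_DISQUALIFY: sketch of a helper that only disqualifies a participant
 * when at least two independent quality flags are true, e.g. an AI rejection
 * plus a speeding or straight-lining check.
 *
 * Example usage in a cell, assuming the AI verdict is in C2 and the other
 * checks are TRUE/FALSE columns: =SHOULD_DISQUALIFY(LEFT(C2, 6) = "REJECT", D2, E2)
 *
 * @customfunction
 */
function SHOULD_DISQUALIFY(aiRejected, isSpeeder, isStraightLiner) {
  var flags = [aiRejected, isSpeeder, isStraightLiner];
  var flaggedCount = flags.filter(function (flag) { return flag === true; }).length;
  return flaggedCount >= 2;
}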
Validation
Given that humans are generally more forgiving than machines, reviewing the responses dismissed by the model can help prevent valid participants from being rejected. If the model rejects a significant number of participants, you can purposely include poorly written open-ended responses in the training data to introduce more lenient assessment criteria.
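One way to encode that leniency, assuming a few-shot setup like the VERIFY_RESPONSE() sketch above, is to deliberately label a terse, poorly written but genuine answer as acceptable; the example pair below is hypothetical.

// Illustrative few-shot pair that can be appended to the messages array to
// nudge the model toward leniency: the answer is sloppy but still on topic,
// so it is labelled VALID on purpose.
var lenientExamples = [
  { role: 'user', content: 'Question: Why did you choose this brand?\nAnswer: cheep and it works ok i guess' },
  { role: 'assistant', content: 'VALID' }
];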
Efficiency
Building a repository of commonly-used open-ended questions across multiple surveys reduces the need to train the model from scratch each time. This has the potential to enhance overall efficiency and productivity.
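One rough way to implement such a repository, continuing the Apps Script sketches above, is to store reusable few-shot examples keyed by question type; the keys and examples below are purely illustrative.

// Hypothetical question bank: reusable few-shot examples keyed by question
// type, so a new survey only references the matching key instead of
// assembling fresh examples each time.
var QUESTION_BANK = {
  product_likes: [
    { role: 'user', content: 'Question: What do you like about this product?\nAnswer: good' },
    { role: 'assistant', content: 'REJECT: too vague to assess relevance' }
  ],
  purchase_reason: [
    { role: 'user', content: 'Question: Why did you buy this product?\nAnswer: My old one broke and this was on sale.' },
    { role: 'assistant', content: 'VALID' }
  ]
};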
Human Thinking Meets AI Scalability
The success of generative AI in assessing open-ended responses hinges on the quality of prompts and the expertise of researchers who curate the training data.
While generative AI will not completely replace humans, it serves as a valuable tool for automating and streamlining the assessment of open-ended responses, resulting in significant time and cost savings.