How I Improved the GPT2 Output Detector | by Michael | GoPenAI



How I Improved the GPT2 Output Detector

The GPT2 Output Detector is a powerful classifier that detects ChatGPT-generated text with impressive accuracy. I extended the detector’s frontend to handle large texts, support custom file uploads, and allow parallel processing of many files.

These new features make the GPT2 detector more usable and accessible. In this blog post I will describe my changes and how they work.
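To give a sense of the file-upload and parallel-processing idea, here is a minimal TypeScript sketch of a browser frontend that reads every selected file and scores them concurrently. The endpoint URL, the request body, and the `fake_probability` field are placeholders for illustration, not the detector's actual API or the project's actual code.

```typescript
// Minimal sketch, not the project's actual code: read every selected file
// and score them in parallel. Endpoint and response shape are assumed.
const DETECTOR_URL = "https://example.com/detect"; // placeholder endpoint

async function scoreText(text: string): Promise<number> {
  const res = await fetch(DETECTOR_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });
  const data = await res.json();
  return data.fake_probability; // assumed field name
}

async function scoreFiles(input: HTMLInputElement): Promise<void> {
  const files = Array.from(input.files ?? []);
  // Read and score all files concurrently instead of one at a time.
  const results = await Promise.all(
    files.map(async (file) => ({
      name: file.name,
      score: await scoreText(await file.text()),
    }))
  );
  for (const { name, score } of results) {
    console.log(`${name}: ${(score * 100).toFixed(1)}% likely AI-generated`);
  }
}

const picker = document.querySelector<HTMLInputElement>("#file-input");
picker?.addEventListener("change", () => picker && scoreFiles(picker));
```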

You can try out the application at: https://michaelgithubhype.github.io/Extended-GPT2-Output-Detector/index.html

Here is a full demo:

Background

This project is based on my previous post, where I tested the accuracy of the GPT-2 Output Detector. I found that the tool is accurate: it reliably distinguishes human-written text from AI-generated text.

However, there is an issue: the GPT2 Output Detector is limited to 510 tokens. For long posts, the result is sensitive to small changes at the beginning of the text, which restricts the tool to social platforms with short posts. Ideally, the tool would take a holistic view of the entire content.
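As a rough sketch of how a frontend could take that holistic view (my assumptions, not the project's actual implementation), the snippet below splits a long text into chunks under the 510-token limit, scores each chunk, and averages the results. Tokens are approximated here by whitespace-separated words, and the endpoint and `fake_probability` field are again placeholders.

```typescript
// Minimal sketch: chunk a long text to stay under the 510-token limit,
// score each chunk, and average the scores for a holistic result.
// Words are used as a rough stand-in for the detector's subword tokens.
const MAX_TOKENS = 510;

function chunkText(text: string, maxTokens = MAX_TOKENS): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxTokens) {
    chunks.push(words.slice(i, i + maxTokens).join(" "));
  }
  return chunks;
}

// Placeholder for a call to the detector API; endpoint and field are assumed.
async function scoreChunk(chunk: string): Promise<number> {
  const res = await fetch("https://example.com/detect", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text: chunk }),
  });
  return (await res.json()).fake_probability;
}

async function holisticScore(text: string): Promise<number> {
  const scores = await Promise.all(chunkText(text).map(scoreChunk));
  return scores.reduce((sum, s) => sum + s, 0) / scores.length;
}
```

Averaging is only one possible aggregation; weighting chunks by length or reporting the per-chunk range would work as well.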

The problem with the GPT-2 Output Detector is that the API limits the number of allowed tokens. Here is the response format once a user sends more than 510 tokens:
