How I Improved the GPT2 Output Detector | by Michael | GoPenAI



How I Improved the GPT2 Output Detector

The GPT2 Output Detector is a powerful classifier that detects ChatGPT-generated text with impressive accuracy. I extended the detector’s frontend to handle large texts, support custom file uploads, and allow parallel processing of many files.

These new features make the GPT2 detector more usable and accessible. In this blog post I will describe my changes and how they work.
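To give a sense of the file-upload and parallel-processing idea, here is a minimal TypeScript sketch of a browser frontend that reads every selected file and scores them concurrently. The endpoint URL, the request body, and the `fake_probability` field are placeholders for illustration, not the detector's actual API or the project's actual code.

```typescript
// Minimal sketch, not the project's actual code: read every selected file
// and score them in parallel. Endpoint and response shape are assumed.
const DETECTOR_URL = "https://example.com/detect"; // placeholder endpoint

async function scoreText(text: string): Promise<number> {
  const res = await fetch(DETECTOR_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });
  const data = await res.json();
  return data.fake_probability; // assumed field name
}

async function scoreFiles(input: HTMLInputElement): Promise<void> {
  const files = Array.from(input.files ?? []);
  // Read and score all files concurrently instead of one at a time.
  const results = await Promise.all(
    files.map(async (file) => ({
      name: file.name,
      score: await scoreText(await file.text()),
    }))
  );
  for (const { name, score } of results) {
    console.log(`${name}: ${(score * 100).toFixed(1)}% likely AI-generated`);
  }
}

const picker = document.querySelector<HTMLInputElement>("#file-input");
picker?.addEventListener("change", () => picker && scoreFiles(picker));
```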

You can try out the application at: https://michaelgithubhype.github.io/Extended-GPT2-Output-Detector/index.html

Here is a full demo:

Background

This project is based on my previous post, where I tested the accuracy of the GPT-2 Output Detector. I found that the tool is accurate: it reliably distinguishes human-written text from AI-generated text.

However, there is an issue: the GPT2 Output Detector is limited to 510 tokens. For long posts, the result is sensitive to small changes at the beginning of the text, which restricts the tool to social platforms with short posts. Ideally, the tool would take a holistic view of the entire content.
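As a rough sketch of how a frontend could take that holistic view (my assumptions, not the project's actual implementation), the snippet below splits a long text into chunks under the 510-token limit, scores each chunk, and averages the results. Tokens are approximated here by whitespace-separated words, and the endpoint and `fake_probability` field are again placeholders.

```typescript
// Minimal sketch: chunk a long text to stay under the 510-token limit,
// score each chunk, and average the scores for a holistic result.
// Words are used as a rough stand-in for the detector's subword tokens.
const MAX_TOKENS = 510;

function chunkText(text: string, maxTokens = MAX_TOKENS): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxTokens) {
    chunks.push(words.slice(i, i + maxTokens).join(" "));
  }
  return chunks;
}

// Placeholder for a call to the detector API; endpoint and field are assumed.
async function scoreChunk(chunk: string): Promise<number> {
  const res = await fetch("https://example.com/detect", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text: chunk }),
  });
  return (await res.json()).fake_probability;
}

async function holisticScore(text: string): Promise<number> {
  const scores = await Promise.all(chunkText(text).map(scoreChunk));
  return scores.reduce((sum, s) => sum + s, 0) / scores.length;
}
```

Averaging is only one possible aggregation; weighting chunks by length or reporting the per-chunk range would work as well.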

The problem with the GPT-2 Output Detector is that the API limits the number of allowed tokens. Here is the response format once a user sends more than 510 tokens:
