Making a 4chan Twitter Bot with Groovy in 8 Easy Steps

First off, if you’re reading this you’re probably already wondering: why would anyone want to make such a thing?

“Just what is wrong with you?” might be one of the thoughts racing through your head right now. I understand, but I’ll ask you to suspend your disbelief and indulge me for a moment.

I Promise I’m Not Evil

As it stands, 4chan is one of the most distinctive places on the Internet, with a long and rich history of angst-fueled mischief, wildly irreverent humor, unabashedly offensive locker-room banter, the occasional problem with law enforcement, and much, much more.

However, more importantly, I believe it has also acted as an incubator for mainstream Internet culture, providing us with countless memes, phrases, and attitudes that either directly or indirectly shape the sort of behavior we can expect to find both online and offline.

Therefore, and especially for those of us with a fascination for culture, wouldn’t it be interesting to reach into the “brain” of this dynamic, anonymous hive mind and grab just one momentary thought?

Wouldn’t it be worth it to sift through all the erratic noise to find one genuinely thought-provoking gem of commentary? Or… at least, a sincerely silly one!

That’s where the idea for this project came from.

And if you just want to see the real-life application of it, hop on over to Twitter to take a look at @r9k_b, a 4chan Twitter Bot that runs on this very same source code to tweet original user-content from 4chan’s /r9k/ board.

Getting Started

Now, without further ado, let’s dive into the code. With this, you’ll be able to make your own Twitter Bot to pull tweetable commentary from any 4chan board you’re interested in.

You’ll be using the Groovy programming language to write a simple .groovy one-off script. If you already know Java, you basically already know Groovy. If not, now might be a great time to get familiar with it.

Groovy is extremely fun, flexible, and productive to use. Plus, it will grant you access to the entire Java ecosystem of tools and libraries.

If you’re not interested in the process of building the bot, you can go on ahead and view the complete source code directly at the r9k_b GitHub repo.

What You’ll Need

  1. An Installation of the JDK and the Groovy SDK
  2. A Twitter Account
  3. The 4chan API
  4. The Jsoup Library
  5. The Twitter4j Library

First Things First

Before you can build a Twitter Bot, you first need to register an application with Twitter’s API. Make sure your account has a valid phone number associated with it (for account activation); otherwise, you will not be able to use it as a bot.

To achieve this you must:

  1. Visit the Twitter Apps page.
  2. Click the “Create New App” button.
  3. Complete the application form.
  4. Submit it with the “Create Your Twitter Application” button.
  5. Go to the tab that reads “Keys and Access Tokens”.
  6. Click on “Create My Access Tokens” at the bottom.

Now, take note of the following:

  • The Consumer Key (API Key)
  • The Consumer Secret (API Secret)
  • The Access Token
  • The Access Token Secret

With these values on hand, create a twitter4j.properties file in the same directory as your Groovy script. Paste the following into the file, replacing the placeholder values with your own:

oauth.consumerKey=Your_Consumer_Key
oauth.consumerSecret=Your_Consumer_Secret
oauth.accessToken=Your_Access_Token
oauth.accessTokenSecret=Your_Access_Token_Secret
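
If you want to catch a half-filled properties file before the bot tries to tweet, a small check like the one below can help. This is only a sketch: missingOAuthKeys is a hypothetical helper name (it is not part of Twitter4J), and the sample text stands in for a real file.

```groovy
// Hypothetical helper: reports which of the four OAuth keys are missing or blank.
List<String> missingOAuthKeys(Properties props) {
    def required = ['oauth.consumerKey', 'oauth.consumerSecret',
                    'oauth.accessToken', 'oauth.accessTokenSecret']
    required.findAll { key -> !props.getProperty(key)?.trim() }
}

// Made-up sample content standing in for a real twitter4j.properties file.
def sample = '''\
oauth.consumerKey=abc
oauth.consumerSecret=def
oauth.accessToken=
'''
def props = new Properties()
props.load(new StringReader(sample))

println missingOAuthKeys(props) // [oauth.accessToken, oauth.accessTokenSecret]
```

To check your actual file, you could load it with new File('twitter4j.properties').withInputStream { props.load(it) } instead of the sample string.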

What You’ll Be Doing

  1. Using Grape annotations to quickly and easily add your library dependencies to your script.
  2. Making a 4chan API request for the catalog of active threads in a board of your choosing.
  3. Parsing the JSON response.
  4. Iterating over the catalog in order to store the thread IDs for future retrieval of each thread’s user posts.
  5. Making another 4chan API request for each of the stored threads.
  6. Parsing the JSON response.
  7. Iterating over each of the parsed threads in order to store the “tweetable” (length <= 140) user posts/comments.
  8. Picking and tweeting a stored user comment to your account.

Step 1: Adding Imports and Dependencies

This code snippet contains all you need to get started with writing the actual code in your script. It uses Grape to fetch the .jar files for you, so you don't have to worry about dependency management and you can focus on writing your code:

import groovy.json.JsonSlurper
import java.net.URL
import java.text.SimpleDateFormat

@Grab(group='org.twitter4j', module='twitter4j-core', version='4.0.5')
import twitter4j.Status
import twitter4j.Twitter
import twitter4j.TwitterException
import twitter4j.TwitterFactory

@Grab(group='org.jsoup', module='jsoup', version='1.9.2')
import org.jsoup.Jsoup

Steps 2-4: Retrieving Catalog and Thread Board Data

These steps require some underlying knowledge of the 4chan API and the intuitive Groovy JsonSlurper class. Note also that Groovy overloads the left shift operator (<<) to append elements to a List object.
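
As a quick illustration of the left shift operator on a List, here is a minimal sketch. The variable name and the numbers are made up for the example.

```groovy
def threadNumbers = []
threadNumbers << 101        // same effect as threadNumbers.add(101)
threadNumbers << 102 << 103 // << returns the list, so calls can be chained
println threadNumbers       // [101, 102, 103]
```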

As implied in the beginning, this example will use the /r9k/ board as a data source.

final String BOARD_NAME = "/r9k/"
final String THREAD_CATALOG_REQUEST_URL = "https://a.4cdn.org" + BOARD_NAME + "threads.json"

println "Retrieving " + BOARD_NAME + " data..."

threadCatalogData = new URL(THREAD_CATALOG_REQUEST_URL).getText() // Makes the HTTP request.
List threadCatalog = new JsonSlurper().parseText(threadCatalogData) // Parses JSONObjects into Maps and JSONArrays into Lists.

// Store all active threads from the board.
listOfThreads = []
numOfPages = threadCatalog.size()
numOfPages.times { i ->
    Map catalogPage = threadCatalog.get(i)
    List threadsInPage = catalogPage.threads
    numOfThreadsInPage = threadsInPage.size()
    numOfThreadsInPage.times { j -> listOfThreads << threadsInPage.get(j).no }
}
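
If you would like to try the parsing logic without hitting the network, the sketch below runs it against a made-up sample that has the same shape as a threads.json response (a list of pages, each holding a list of threads with a "no" field). The thread numbers are invented, and collectMany is just one alternative to the nested times loops above, not the approach the original script takes.

```groovy
import groovy.json.JsonSlurper

// Made-up sample with the same shape as a threads.json response.
def sampleCatalog = '''
[
  {"page": 1, "threads": [{"no": 111}, {"no": 222}]},
  {"page": 2, "threads": [{"no": 333}]}
]
'''
List pages = new JsonSlurper().parseText(sampleCatalog)

// Collect the "no" field of every thread on every page into one flat list.
List threadNos = pages.collectMany { page -> page.threads*.no }
println threadNos // [111, 222, 333]
```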

Steps 5-7: Retrieving Tweetable User Comments

These steps are very similar to the previous three. The main difference is that you will be using a different method, upto, to loop for a specified number of iterations. You can find out more about these and other looping methods in the Groovy documentation.
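
To see how the two looping methods differ, here is a small side-by-side sketch. The list names are made up for the example.

```groovy
def viaTimes = []
3.times { i -> viaTimes << i }    // counts from 0 up to, but not including, 3

def viaUpto = []
1.upto(3) { i -> viaUpto << i }   // counts from 1 up to and including 3

println viaTimes // [0, 1, 2]
println viaUpto  // [1, 2, 3]
```

Because list indices start at 0, a loop written as 1.upto(n) over a list never visits the element at index 0. That applies to the upto loop in this step as well: the first thread in the catalog is skipped.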

listOfComments = []

// Retrieve all posts from 20% of the most recent threads in the board catalog.
1.upto(listOfThreads.size() / 5) { i ->
    chosenThreadNo = listOfThreads.get(i)
    threadPageRequestUrl = "https://a.4cdn.org" + BOARD_NAME + "thread/" + 
        chosenThreadNo + ".json"

    threadPageData = new URL(threadPageRequestUrl).getText()
    Map threadPage = new JsonSlurper().parseText(threadPageData)

    List posts = threadPage.posts

    // Grab all the thread comments of Twitter-able length (with no URLs or web links).
    final int CHARACTER_LENGTH = 140
    numOfPosts = posts.size()
    numOfPosts.times { j ->
        Map post = posts.get(j)
        if (post.com != null) {
            comment = Jsoup.parse(post.com).text() // Removes HTML from comment.
            if (comment.length() <= CHARACTER_LENGTH && !comment.contains("www") 
                && !comment.contains("http"))
                listOfComments << comment
        }
    }

    Thread.sleep(2000) // Respect the 4chan API rules.
}
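
The filtering rule in the middle of that loop can also be pulled out into a small function, which makes it easier to experiment with. The sketch below is one way to do that; isTweetable is a hypothetical helper name, and it assumes the comment has already had its HTML stripped.

```groovy
// Hypothetical helper mirroring the filter above: short enough and no obvious links.
boolean isTweetable(String comment, int maxLength = 140) {
    comment.length() <= maxLength &&
        !comment.contains('www') &&
        !comment.contains('http')
}

def candidates = ['short and sweet', 'see http://example.com', 'x' * 141]
println candidates.findAll { isTweetable(it) } // [short and sweet]
```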

Step 8: Posting the Tweet

// Choose a comment for Twitter posting.
chosenCommentIndex = new Random().nextInt(listOfComments.size())
chosenComment = listOfComments.get(chosenCommentIndex)

twitter = TwitterFactory.getSingleton()
twitter.updateStatus(chosenComment)

Date date = new Date()
SimpleDateFormat today = new SimpleDateFormat("EEE, d MMM yyyy HH:mm:ss Z")
println today.format(date) + " - Updated Twitter status with: " + chosenComment

// Wait for a while (15 mins) until the next tweet.
Thread.sleep(900000)
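
One thing to watch for: if no tweetable comment was found, listOfComments is empty and Random.nextInt(0) throws an IllegalArgumentException. A guarded version of the random pick might look like the sketch below. pickComment is a hypothetical helper name and is not part of the original script.

```groovy
// Hypothetical helper: returns a random comment, or null when there is nothing to pick.
String pickComment(List<String> comments, Random random = new Random()) {
    if (comments.isEmpty()) {
        return null // Random.nextInt(0) would throw IllegalArgumentException
    }
    comments.get(random.nextInt(comments.size()))
}

println pickComment([])               // null
println pickComment(['only comment']) // only comment
```

With a helper like this, the script could skip the updateStatus call whenever the result is null and simply wait for the next round.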

Final Step: Putting It All Together

To finish off the script, wrap all the code logic inside a while loop that keeps executing until you decide to close down the console:

import groovy.json.JsonSlurper
import java.net.URL
import java.text.SimpleDateFormat

@Grab(group='org.twitter4j', module='twitter4j-core', version='4.0.5')
import twitter4j.Status
import twitter4j.Twitter
import twitter4j.TwitterException
import twitter4j.TwitterFactory

@Grab(group='org.jsoup', module='jsoup', version='1.9.2')
import org.jsoup.Jsoup

while(true) {
    final String BOARD_NAME = "/r9k/"
    final String THREAD_CATALOG_REQUEST_URL = "https://a.4cdn.org" + BOARD_NAME + "threads.json"

    println "Retrieving " + BOARD_NAME + " data..."

    threadCatalogData = new URL(THREAD_CATALOG_REQUEST_URL).getText() // Makes the HTTP request.
    List threadCatalog = new JsonSlurper().parseText(threadCatalogData) // Parses JSONObjects into Maps and JSONArrays into Lists.

    // Store all active threads from the board.
    listOfThreads = []
    numOfPages = threadCatalog.size()
    numOfPages.times { i ->
        Map catalogPage = threadCatalog.get(i)
        List threadsInPage = catalogPage.threads
        numOfThreadsInPage = threadsInPage.size()
        numOfThreadsInPage.times { j -> listOfThreads << threadsInPage.get(j).no }
    }

    listOfComments = []

    // Retrieve all posts from 20% of the most recent threads in the board catalog.
    1.upto(listOfThreads.size() / 5) { i ->
        chosenThreadNo = listOfThreads.get(i)
        threadPageRequestUrl = "https://a.4cdn.org" + BOARD_NAME + "thread/" + 
            chosenThreadNo + ".json"

        threadPageData = new URL(threadPageRequestUrl).getText()
        Map threadPage = new JsonSlurper().parseText(threadPageData)

        List posts = threadPage.posts

        // Grab all the thread comments of Twitter-able length (with no URLs or web links).
        final int CHARACTER_LENGTH = 140
        numOfPosts = posts.size()
        numOfPosts.times { j ->
            Map post = posts.get(j)
            if (post.com != null) {
                comment = Jsoup.parse(post.com).text() // Removes HTML from comment.
                if (comment.length() <= CHARACTER_LENGTH && !comment.contains("www") 
                    && !comment.contains("http"))
                    listOfComments << comment
            }
        }

        Thread.sleep(2000) // Respect the 4chan API rules.
    }

    // Choose a comment for Twitter posting.
    chosenCommentIndex = new Random().nextInt(listOfComments.size())
    chosenComment = listOfComments.get(chosenCommentIndex)

    twitter = TwitterFactory.getSingleton()
    twitter.updateStatus(chosenComment)

    Date date = new Date()
    SimpleDateFormat today = new SimpleDateFormat("EEE, d MMM yyyy HH:mm:ss Z")
    println today.format(date) + " - Updated Twitter status with: " + chosenComment

    // Wait for a while (15 mins) until the next tweet.
    Thread.sleep(900000)
}

Now that you've got the whole script, save it (if you haven't already) as filename.groovy. You can execute it with the groovy filename.groovy command or simply give it a double click to run if you installed the Groovy SDK properly.

Congratulations! You now have your very own operational 4chan Twitter Bot.

Now sit back, run the script, and be prepared to be made very uncomfortable, repeatedly.

Update: The original r9k_b script has been Groovy'fied by none other than GR8Conf founder Søren Berg Glasius. You can view the updated script over in the main branch of the r9k_b repo.

3 thoughts on “Making a 4chan Twitter Bot with Groovy in 8 Easy Steps”

    1. Søren, it’s great to have you here!

      I love your version of the script, it’s so much more groovy. Definitely a great learning experience for those of us who are still in the Java mindset of things.

      Thanks a lot. Much appreciated.
