some word lists for bot-making

After a couple years of making bots, I’ve made a fair few wordlists that I thought other people might be able to make use of, so I decided to release them. These are provided under a Creative Commons Attribution-NonCommercial 4.0 International License; if you want to use them commercially, please contact me and we can work something out.

I’m just providing these as lists in txt files; I use delim.co to turn them into Tracery code.

 

a big list of bot-making resources

I’ve been talking about how easy it is to get started making bots for a while now, but I haven’t done a long-form writeup of all of the tools that I use that you might find useful. So here we go: a list of resources that I use for making Twitter bots.

Tracery Basics

Non-Tracery Tools

Word Lists

List inspiration

  • The Noun Project: I never use any of the Noun Project stuff directly, but they have a lot of lists of icons of various objects put in useful categories, which can be good for when you need to think about, say, common appliances, but it’s hard to name them all off the top of your head.
  • Spellzone word lists: These are for learning English, but can also be useful for when you need a bunch of words for certain things, like buildings, appliances, household objects, etc.; I usually just keep a notepad tab open and write down a handful that will work for what I’m doing.
  • Wordnik: I often will pick a fun-sounding word, look at what lists people put it on, and then look at those lists for more word inspiration.

Text/List Editing Tools

Images/Graphics

Miscellaneous Guides

Further Reading

What’s with @thinkpiecebot lately?

It’s been over a year since I started @thinkpiecebot, and it’s grown a LOT in the interim. The code is a huge, often unwieldy mess, there are over 100 formulas in it, and in the past couple of months the scope and the tone have gone through some… changes.

It was a rough election year, and since we elected a white supremacist sexual predator as the next president of the United States, it’s only gotten worse.  We’re watching a fascist gather military leaders around him and fuck up foreign policy for his own profit. Abortion bans gaining traction, hate crimes going up, reasons to crack down on the poor, on POC, on trans people, all going into overdrive.

It wasn’t good to begin with. At any given time, I know about a dozen people crowdfunding their medical expenses, and more people crowdfunding their other survival needs.

Meanwhile, the discourse has been absolute garbage. We keep seeing fawning profiles and photo shoots with the ringleaders of hate groups, normalizing them. A bunch of dudes on the left have managed to normalize a political environment that mostly consists of them taking screenshots of marginalized leftists on Twitter and making jokes about and/or harassing them; to add insult to injury, the jokes aren’t even very good. This happened. Political propaganda and coordinated disinformation campaigns are being called “fake news” and confused with The Onion, which also isn’t very good anymore. Dudes keep calling for “civility” and “respect” for the people who supported our fascist president-elect. The Daily Show gave a white supremacist a platform that got her a NY Times puff piece, then sent her cupcakes. That’s just a sample.

Personally, I’ve been dealing with a group of stalkers that a prominent “feminist activist” sent after me. I have various groups of people, some of them trans, showing up to tell me my gender isn’t real at least once a week now. Throw in some medical problems and I’m mostly a ball of exhaustion and low-simmering rage that happens to take human form and make Twitter bots, and if I look at my friends, I see that I’ve still got it pretty good.

When I started thinkpiecebot, I was mostly making fun of terrible op-eds by baby boomers where they pretend they didn’t ruin the economy and fuck over our entire generation. Lately, though, I’ve been pretty much channeling all of my frustration and rage at the current state of political discourse, the utter failures of journalism to cover what this is– a full-scale fascist revival, and the mostly cishet white dudes with platforms who refuse to step aside and let people who aren’t so privilege-blinded that they don’t know what they’re talking about do their jobs. The total spinelessness of both the liberal media and the Democratic party has been predictable but disappointing, and I’m seeing a lot of the people I care about be thrown under the bus in an attempt to be conciliatory with neo-nazis. And, well, this bot is the platform I have, and the headlines are still terrible enough that I can riff on that.

Thinkpiecebot is built in Tracery, and it works by having a series of formulas with lists of things that get plugged in. For example, “[EXCLAMATION], A [MODIFIER] [BIGOT TYPE]” can make tweets like “Holy Shit, A Suit-Wearing Racist”. Most of my formulas have pretty long lists of words that can get plugged into them, but I’m at a point where I have over 100 formulas, so I can put in a few that have a more limited selection of possible phrases and not worry about them naturally coming up more than every few days. I’ve been putting in a lot more formulas that are extremely pointed criticism of specific types of stories.
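If it helps to see the "plug things into a formula" idea as code, here's a rough Python sketch. This is not the bot's actual code (that's Tracery, and it's much bigger); the word lists beyond the one example above, the fill and expected_hours_between helpers, and the assumption that every formula is equally likely are all made up for illustration. It also does the "every few days" arithmetic: with 100 equally likely formulas and one post an hour, any one formula comes up about once every 100 hours on average.

```python
import random

# Hypothetical word lists; only the first entry of each comes from the example above.
SLOTS = {
    "EXCLAMATION": ["Holy Shit", "Good Grief"],
    "MODIFIER": ["Suit-Wearing", "Tie-Wearing"],
    "BIGOT TYPE": ["Racist", "Bigot"],
}


def fill(formula, slots, rng):
    """Replace each [SLOT] in the formula with a random entry from its list."""
    out = formula
    for name, options in slots.items():
        token = "[" + name + "]"
        while token in out:
            out = out.replace(token, rng.choice(options), 1)
    return out


def expected_hours_between(formula_count, posts_per_hour=1.0):
    """Average gap between uses of one formula, assuming uniform random picks."""
    return formula_count / posts_per_hour


rng = random.Random()
headline = fill("[EXCLAMATION], A [MODIFIER] [BIGOT TYPE]", SLOTS, rng)
gap_days = expected_hours_between(100) / 24  # a bit over four days

print(headline)
print(round(gap_days, 1), "days between repeats of one formula, on average")
```

So a formula with only a handful of possible phrases can sit in the pile without the same joke showing up every afternoon.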

This isn’t an accident, and it isn’t a bot that learns automatically picking something up. This is me, attempting to use satire to point out something that the media is utterly failing at, assisted by a random number generator. Sometimes it works really well. For example:

What’s great about the evolution of thinkpiecebot is that a lot of the pieces I was originally mocking were about various forms of completely trivial bullshit, from cupcake shops to comic books, and I’ve been expanding it for over a year. That’s a year of reading terrible thinkpieces, taking suggestions from people on Twitter, and sending drunk emails to myself while my friends are in the bathroom to remind myself to put the stuff we were talking about in when I get home. It means there are over 1,500 unique phrases in the bot that get plugged into the 100+ formulas.

I can’t hold all the stuff that goes in there in my head at a time; I’d never have thought to put “Cupcake Shops” in the above tweet. That’s the power of the random number generator. When you combine it with phrases like “The United States Electing A Sexual Predator As President” and “Headlines That Are Too Fucking Precious To Use The Word ‘Fascism'” that are obviously written by me in my voice, the bot starts to sound less like a bot and more like me.

The bot’s less funny now. But sticking with making fun of the kind of shitty thinkpiece that was bugging me in the middle of 2015 would have just made it slip into irrelevance, and from the beginning, thinkpiecebot has been a living project that expands and changes. I’m longing for a time when the thinkpieces that made me mad were just boomer nonsense about Pokémon, but I think this bot is gonna keep getting darker for a while. Maybe it will be funnier if we can fix the world.

the saga of @christianmom18

Over the past few months, I’ve been making a handful of “honeybots”– bots that act as a honeypot for Twitter trolls. There are a lot of people on Twitter who search for specific terms and then yell at people who mention them; they go on about topics that range from chemtrails and the flat earth to various alt-right people with cult followings to atheism. A handful of bullies particularly like to search for people who mention them negatively and then retweet those people to their followers, leading a harassment mob to their virtual door.

I’ve been documenting these on a Tumblr blog, and they caught the attention of the Washington Post, among other places. Sarah Nyberg made an advanced, improved honeybot that is even better at catching trolls and wasting their time than mine. It’s been a fun couple weeks for automatically wasting the time of assholes.

One of the groups of people who have been really easy to bait and really plentiful are “internet atheists”. If you’ve spent much time adjacent to the atheist community, you’ve probably run into these– their identity revolves around their atheism; they generally hate religion of all stripes, and they roam the wilds of the web trying to pick fights with people. They tend to have a personal philosophy that revolves around the idea of skepticism, but aren’t very good at embracing its principles themselves– they tend to have “scientific” justifications for all of their personal biases and often express a good deal of transphobia, racism and misogyny.

What these people seem to like more than anything else is to fight their idea of “theists” without actually having to engage in ideas. They want a straw Christian to use as a punching bag. So I made them one.

Enter @christianmom18, AKA “carol”.

Carol is a bit of an improvement on @good_opinions and @opinions_good. She has a lot more basic things to say as “bait”, and I made her reply occasionally with those instead of just with the 17 “argument” phrases– “you are wrong”, “check the bible”, “no”, etc. I also gave her a chance of “correcting” with “*your” anytime someone said “you’re” and vice versa, regardless of whether or not they were using the word correctly. (Another great Sarah idea.) I also set her up to auto-retweet some Bible quotes accounts and the official accounts of both Kellogg’s cereals and the New York Yankees.
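The “*your”/“you’re” trick is simple enough to sketch in a few lines of Python. To be clear, this is not carol’s real code: the correction_for function, the regular expression and the 50% default chance are all stand-ins I’m assuming for illustration, and it only builds the reply text rather than talking to Twitter.

```python
import random
import re

# What to "correct" each spelling to, regardless of whether it was right.
SWAPS = {"your": "*you're", "you're": "*your", "you’re": "*your"}

# Matches "you're" (straight or curly apostrophe) or the whole word "your".
PATTERN = re.compile(r"\byou(?:'|’)re\b|\byour\b", re.IGNORECASE)


def correction_for(text, rng, chance=0.5):
    """Return a bogus "correction" some of the time, or None for no reply."""
    match = PATTERN.search(text)
    if match is None or rng.random() >= chance:
        return None
    return SWAPS[match.group(0).lower()]


rng = random.Random()
print(correction_for("you're wrong about this", rng, chance=1.0))
print(correction_for("check your sources", rng, chance=1.0))
```

The point of the coin flip is that the “correction” only shows up sometimes, which reads a lot more like a person than a reply that fires on every single match.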

carol1

I got some help from Sarah and some of my gaming friends to come up with this stuff and had “carol” stick emoji and random punctuation on the end of the phrases so that they didn’t get caught by Twitter as duplicates, and practically immediately started getting bites.

(I actually removed the “going to hell” thing from carol’s vocabulary after this– I realized that was kind of an asshole thing to say, even if the only people who saw it were jerks.)

A lot of these accounts are people who spend a lot of time searching Twitter for terms like “atheists” to find people to dunk on, and they often follow each other, so “carol”‘s posts quickly spread among that network, and people started talking to her.

At length.

carol3 carol4 carol5

That’s just the beginning. Richy went on to talk to “her” for about three hours– through several repetitions of her “arguments”.

Some people figured out that she was a troll, but not that she was a bot.

carol6

Carol’s maturity was called into question.

carol7

She was told she was a bad mother.

carol8

 

She was insulted:

carol10

And this is just the beginning. After this, “carol” was quote tweeted by a popular atheist account that seems mostly to spend time dunking on theists, and her followers came in droves.

carol11

Check out how smug these people are:

carolsmug2

carolsmug1

She was ‘splained to pretty much endlessly:

carol9

carolsplain

After about a day of “Atheist Girl”‘s followers talking to carol– a few of them figured out that “she” is a bot, but not many– I started noticing that they were even taking the really obvious bait.

carol12

carol13

By this point, Twitter’s automated system had shadowbanned “carol”, so her posts weren’t showing up in Twitter search results– all the people talking to her found her via other people’s retweets, quote tweets and conversations. So I got some help from friends to come up with really wacky things to say.

carol15

This is just the tip of the iceberg: search Twitter to see all “her” mentions. I also shared her source code, though it’s not free for reuse without permission.

on tessera(e)

 

The Chapelle Palantine, illuminated by sunlight and candles.

By Urban~commonswiki; used under a CC licence.

 

I wrote a piece recently for BotWatch that touched on this, but I wanted to give a brief explanation of why I use the word “tessera” when I talk about my bots.

A tessera is a single tile used in a mosaic; every tessera is suspended in mortar, and depending on where the mosaic’s viewer is standing and the lighting conditions, different tiles may be illuminated. Guy Gavriel Kay explains in Sailing to Sarantium:

“The curve and the height of a dome allow us the illusion of movement through changing light, my lord. Opportunities beyond price. It is the mosaicist’s natural place. His… haven. A painted fresco on a flat wall can do all a mosaic can, and – though many in my guild would call this heresy – it can do more at times. Nothing on Jad’s earth can do what a mosaicist can do on a dome if he sets the tesserae directly on the surface.”

a list of words from @thinkpiecebot

Some of @thinkpiecebot’s tesserae.

My bots have lots of words and phrases in them, and they’re set in formulas. What the bot says is chosen pseudorandomly, just like the flicker of a candle may pseudorandomly “choose” different tesserae to illuminate, and just like that flickering and winking will be different for people standing in different places.

In some ways, I’m creating the illusion of human thought where there isn’t any; which of my tesserae is “illuminated” depends on what the random number picker that puts bot tweets together happens to land on. Just like changing light creates the illusion of movement, the random numbers create the illusion of a mind behind what the bots post.

how and why @nerdgarbagebot works

As usual, I’ve made a few new Twitter bots since my last update, but I want to talk a little bit about how a particular one– @nerdgarbagebot— works.

a screenshot of @nerdgarbagepitches

I’ve been talking a lot on Twitter about how simple bots can be to set up. I use Cheap Bots Done Quick and the Tracery visual editor for mine, so I hardly have to look at code when I’m writing them– my screen actually looks like this:

a screenshot of the tracery visual editor

Both of these are free tools that you can use to make your own bot; all you need to do is register a new Twitter account to get started.

The results of nerdgarbagebot are pretty varied and rich, but the code is really simple– it only actually has 13 variables in it:

  • Work titles (“Jurassic Park”, “The Sims”)
  • Creators (“George Lucas”, “Nintendo”)
  • Elements (“mermaids”, “a plot”, “interpersonal drama”)
  • Comparative adjectives (“grittier”, “with swears”, etc)
  • Formats/mediums (“tv show”, “installation art piece”, “webcomic”)
  • Genres (“gothic”, “cyberpunk”)
  • Settings (“on the high seas”, “in a modern high school”)
  • Story types (“coming of age story”, “mystery”)
  • Characters/People (“Yoda”, “Joseph Gordon-Levitt”)
  • Character roles (“mentor”, “president”)
  • Audiences (“tweens”, “atheists”)
  • “Imagine this” intro phrases (“Imagine”, “I need you to picture”)
  • “Fund this” intro phrases (“Crowdfund this”, “Please support my”)

For each tweet, Tracery chooses at random from a list of formulas, which are written like this:

  • It’s a #genre# version of #titles#, with #element#.
  • #imagine# a #format# version of #titles#.
  • #imagine# a #genre# #format# version of #titles#, but with #character# as the #role#.
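If you’re curious what Tracery is doing with those #hashtag# bits, here’s a rough sketch of the idea in Python. It’s a simplified stand-in, not Tracery itself (which is a JavaScript library with more features, like modifiers): the expand function is my own illustrative name, the lists only hold the example entries from above, and I’m assuming the formulas live under “origin”, the symbol Tracery starts from by default.

```python
import random
import re

# A cut-down grammar using only the example entries listed above.
GRAMMAR = {
    "origin": [
        "It’s a #genre# version of #titles#, with #element#.",
        "#imagine# a #format# version of #titles#.",
        "#imagine# a #genre# #format# version of #titles#, "
        "but with #character# as the #role#.",
    ],
    "genre": ["gothic", "cyberpunk"],
    "titles": ["Jurassic Park", "The Sims"],
    "element": ["mermaids", "a plot", "interpersonal drama"],
    "imagine": ["Imagine", "I need you to picture"],
    "format": ["tv show", "installation art piece", "webcomic"],
    "character": ["Yoda", "Joseph Gordon-Levitt"],
    "role": ["mentor", "president"],
}

SYMBOL = re.compile(r"#(\w+)#")


def expand(symbol, grammar, rng):
    """Pick one rule for the symbol, then expand any #symbols# inside it."""
    rule = rng.choice(grammar[symbol])
    return SYMBOL.sub(lambda m: expand(m.group(1), grammar, rng), rule)


pitch = expand("origin", GRAMMAR, random.Random())
print(pitch)
```

Every run picks one formula and then fills each slot independently, so even this tiny version can produce a few hundred different pitches.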

I’m increasingly realizing that the reason bots like this work really well– compared to ones like @BuzzFiendNews, which I am still struggling with improving– is that even though the format is simple, every phrase is something that the bot’s audience brings their own baggage and associations to, so the tweets tend to be more unique in and of themselves.

BuzzFiend is a lot harder to add variation to, because even though each tweet has different words in it, every tweet using a specific formula tends to be similar to others. A joke about eating humans tends to be pretty similar to other jokes about eating humans, whether those humans happen to be sidekicks or princes.

Thinkpiecebot, nerdgarbagebot and some of my other bots, especially @likeuberbut, manage to capitalize on people’s existing ideas. Thinkpiecebot’s funniness in particular comes from the unexpected combinations that it produces being put into the recognizable headline format, but doing that ended up being complex– I have over 50 formulas in it and nearly as many variables, and I’m constantly updating it so that it keeps up with the zeitgeist.

Nerdgarbagebot is brand new, and I’ll probably have to continue updating it to keep things current, but I’m having a lot of fun with it– I hope you like the results too!

Make your own @hydratebot: a tutorial for non-coders

So @hydratebot has become pretty popular, and people keep requesting different frequencies of tweets, having themselves @-ed in the tweets, etc, and I’ve been looking for a way to show y’all how easy Cheap Bots Done Quick is so you can set up your own bots. So I’m gonna kill two birds with one stone here and show you how to make your own @hydratebot, which you can modify as you want; just credit @NoraReed in the bot profile.

The first step is creating a new Twitter account. Log into it, then go to Cheap Bots Done Quick and click on the “Sign In With Twitter” button.

a screenshot of the Cheap Bots Done Quick page with an arrow pointing to "sign in with Twitter" and all caps "THAT BUTTON"

You’ll need to authorize the app to post from that account.

authorize

You’ll get a big text box. Copy and paste the following into it:

{
  "please": [
    "please",
    "i command you to",
    "you must",
    "go now and"
  ],
  "have": [
    "have some",
    "drink some",
    "consume"
  ],
  "water": [
    "some water",
    "water",
    "fluids",
    "liquid"
  ],
  "address": [
    "human",
    "friend human",
    "human friend",
    "hu-man",
    "meat-creature",
    "meat-friend",
    "squishy human friend",
    "biofriend",
    "biological being",
    "my friend",
    "my hu-man friend",
    "physical entity"
  ],
  "thanks": [
    "thank you",
    "thank you #address#",
    "thanks",
    "thanks #address#",
    "thank you from a robot who loves you"
  ],
  "hydration": [
    "become hydrated",
    "put water into your mouth-hole",
    "consume this \"water\" that humans require to live",
    "drink hydration",
    "put water into your body so that it will function",
    "drink water so you can maintain your physical form",
    "put liquids into your mouth-hole"
  ],
  "timeto": [
    "it is time to",
    "you must"
  ],
  "standardorigin": [
    "#please# #have# #water#, #address#!",
    "#address#! #please# #have# #water#. #thanks#.",
    "#address#! #timeto# #hydration#.",
    "#timeto# #hydration#, #thanks#.",
    "#address#, #please# #have# #water#."
  ],
  "origin": [
    "#standardorigin#"
  ]
}

Now, set it to the frequency you want and click “save”.

a screenshot of the Cheap Bots Done Quick homepage with arrows pointing at "frequency" and "save" and "WOO" written above them

Want it to @ you? Scroll down to the bottom, where it says “origin”. Insert your Twitter handle before #standardorigin#. This is handy if you want it to take advantage of Twitter’s @ notifications.
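You can do this by hand in the text box, but if you’d rather see it spelled out, here’s a small Python sketch of the same edit. The add_handle function and the “your_handle_here” handle are placeholders I made up, and I’m assuming you want a space between the handle and the rest of the tweet; the grammar here is trimmed down to two entries to keep it short.

```python
import json


def add_handle(grammar, handle):
    """Return a copy of the grammar whose origin rules start with @handle."""
    mention = handle if handle.startswith("@") else "@" + handle
    updated = dict(grammar)
    updated["origin"] = [mention + " " + rule for rule in grammar["origin"]]
    return updated


# A trimmed-down version of the hydratebot grammar.
grammar = {
    "standardorigin": ["#address#! #timeto# #hydration#."],
    "origin": ["#standardorigin#"],
}

personal = add_handle(grammar, "your_handle_here")
print(json.dumps(personal["origin"]))
# prints ["@your_handle_here #standardorigin#"]
```

The printed line is what the “origin” list should look like once you’ve added your handle.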

More Advanced Stuff

If you want to change it in other ways, Tracery’s visual editor is pretty easy to use; just click on “JSON” and copy and paste the code there, then click “JSON” again to get an easy interface:

a screenshot of the Tracery visual editor

GalaxyKate has a Tracery tutorial if you want to do anything advanced with it.

This particular bot has only a single statement for #origin# so that it’s easy to add handles to it without having to modify every #origin# phrase.

You can also use IFTTT to easily connect your tweets to hundreds of other services.

That’s it! Feel free to contact me on Twitter if you have any questions. If you found this useful, consider buying me a coffee or supporting my Patreon. Thanks! Happy hydration!

an update on my work

I’ve been busy working on a bunch of projects, but I haven’t posted in a while, so here’s an update on what I’ve been up to!

I made several new bots:

I’ve also been adding a lot to @thinkpiecebot, which is getting HUGE– the code has over a thousand variables in it now– that means there are over a thousand different things that it forms into different phrases, all of which I put in there by hand. (There are a couple that are repeated to tweak the frequency so it doesn’t use “Whippersnapper” more than it uses “Millennial”, but still.) It’s also been getting a lot of attention: The Daily Dot used a bunch of its tweets for writing prompts, I talked to Slate and Recode for short features and did a longer interview with PopMatters; it also got a writeup in Bustle.

I’ve also been having more success with Patreon and PayPal tips, which has allowed me to spend a lot more time making this art stuff: thanks to those of you who’ve helped me out with that! It’s nowhere near enough for me to do this as a full-time job in the long run, but it’s enough for me to think that might someday be possible.

The Official @Thinkpiecebot FAQ

I made @thinkpiecebot a couple weeks ago and it has really taken off; it already has more than twice as many followers as me. I’ve been interviewed about it twice already and have also gotten a lot of questions on Twitter, so I’m putting together the common ones so I don’t have to keep answering them over and over.

What do you use to make your bots?
All my bots except @NoraReedEbooks and @NORBORG_ebooks use Cheap Bots Done Quick, which runs on Tracery. I set up @NoraReedEbooks using this tutorial; @NORBORG_ebooks was built by @iglvzx; he made a tutorial on setting up your own.

How do they work?
Each bot has a series of formulas that it picks from at random and inserts words from predetermined lists. @Thinkpiecebot actually has two levels of these: the main formulas, such as “Do [GENERATIONAL GROUP] Really Love [RANDOM WORD/PHRASE SELECTED FROM ANY CATEGORY]?”, and a top-level formula that puts a publication prefix in front of one in six tweets.
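One simple way to get a “one in six” split in a formula-based bot is to repeat entries in the top-level list, so five of the six options are the plain formula and one has the prefix. I’m not claiming that’s exactly how @thinkpiecebot’s code is laid out; this Python sketch just shows the arithmetic, and the build_origin and prefixed_share names, the “#publication#: #main#” shape and the sample size are all assumptions for illustration.

```python
import random


def build_origin(every=6):
    """Top-level rules: (every - 1) plain formulas plus one with a prefix."""
    return ["#main#"] * (every - 1) + ["#publication#: #main#"]


def prefixed_share(origin):
    """Fraction of top-level rules that carry the publication prefix."""
    prefixed = sum(rule.startswith("#publication#") for rule in origin)
    return prefixed / len(origin)


origin = build_origin()

# Simulate a lot of picks to check that the split holds up in practice.
rng = random.Random(2016)
sample = [rng.choice(origin) for _ in range(6000)]
observed = sum(rule.startswith("#publication#") for rule in sample) / len(sample)

print(prefixed_share(origin))
print(observed)
```

The same repetition trick works for any list: an entry that appears twice is twice as likely to get picked as one that appears once.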

As of this writing, @thinkpiecebot has main formulas and 25 variables. Some of these variables don’t include very many options: the formula that created the above tweet grabs the verb– “cure”– from a list with only two available options, “cure” and “cause”.

What inspired @thinkpiecebot?
I’m a millennial, and I’m incredibly frustrated by articles written by people outside of our demographic attempting to explain us and doing so badly. You can’t throw a proverbial stone in the internet-news-o-sphere without hitting an article talking about how hypersensitive and vain we are. Boomers offer their Dunning-Kruger driven takes on trigger warnings, conveniently ignoring the freely available information on how PTSD triggers and exposure therapy actually work. They ask questions about why we don’t do things that require money, like have big weddings or buy houses, and come up with ridiculous explanations involving how we got too many awards as kids instead of realizing that their generation completely ruined the economy. @Thinkpiecebot is a way to call out the predictability of these articles, as well as a lot of other kinds of ridiculous output, and the humor of it is a way to cope with the fact that people keep writing them and keep defining my generation by the trumped-up bullshit in them.

You’re really down on Boomers and capitalism. What’s with that?
Capitalist culture attempts to tie our ideas of self-worth to our economic output, and millennials have largely been forced into emotionally and physically draining dead-end jobs that underpay us, if we’re employed at all.

As a generation, we’re struggling to survive in the world that Boomers managed to completely fuck up, and they’re getting paid to write columns on how degenerate we all are for taking selfies. My whole life, I’ve been seeing the output of my generation shat on by people who can’t even be bothered to understand it.

From these people’s perspective, Twitter was a platform for self-obsessed 20-somethings to talk about what they had for breakfast, but after my generation figured out how to use it for large-scale political activism and to connect people to conversations that never would’ve existed, THEN they’re happy to get accounts to promote their “brand”. They’re happy to roll their eyes at fandoms that are creating enormous quantities of creative material and inspiring new writers and artists to make things for their own satisfaction and to share with their communities. They’ll complain about new gender identities and sexual orientations, never realizing how much of a balm to isolation it can be to have a word to describe how you are and to be able to connect to people who feel the same way.

How did you come up with the material for @thinkpiecebot?
Most of it is words and phrases I came up with while looking at horrible thinkpieces, but I got a lot of help from my Twitter followers. They did particularly invaluable work with helping me phrase some of the issues regarding marginalization and privilege; I wanted to be sure that it wasn’t falling into doing “ironic bigotry”, and they helped a lot with coming up with specific phrasings that wouldn’t harm groups who are already being targeted by actual thinkpieces.

Does it run on its own?
I have it set up to post every hour, but sometimes when I add new stuff I have it post a handful of tweets using the new formulas/phrases, or when I’m messing with the code and it comes up with a particularly good sample tweet I will have it post that because it made me laugh.

So you’re still updating it?
I keep thinking of new things to add, so yeah. I’m guessing I will stop eventually, maybe once my cutting satire becomes so popular that everyone stops writing thinkpieces in shame.

I would like to pay you! How do I do that?
I have a Patreon and a PayPal tip jar. Thanks! Your contributions allow me to keep working on new bots and keep improving @thinkpiecebot!

Is @thinkpiecebot open source?
I’ve considered open sourcing my bots, but I am concerned that if I do that, men will do things with them. As soon as someone makes an open source licence that only allows use by women and non-binary folks and forces men to ask my permission to use my code, I’ll probably release it.

Update 5/4/16: I’m now sharing the code of TumblrSimulator for people to view to see how it works, and hydratebot is licenced to be shared if you’re interested.  I share code excerpts with people who ask, but after being updated for nearly a year, @thinkpiecebot is kind of a behemoth; it wouldn’t be very useful as a learning tool, because it’s kind of a kludgey mess on the back end.

Are you serious? Isn’t that… misandry?
¯\_(ツ)_/¯

Why did you block me?
I share my personal blocklist with my bots so that it’s harder for people to harass me via those accounts. As an outspoken feminist, I’m a regular target for online abuse. I might notice if you tweet @ it asking nicely to be unblocked, but it’s my bot, and I get to choose if I don’t want people to have access to it.

Is this really a bot?/Don’t you at least hand-pick the best ones and schedule them?
Yeah, it is, it just seems more coherent than lots of the bots you’re used to because it’s formula-based, not using Markov chains or other, similar techniques. The hourly tweets– the ones that tweet at :11 after the hour– are totally automatic. I do occasionally do tweet-bursts when I add new content, and I pick which of those tweets go up; I also sometimes tweak the code a bit so that new stuff is more likely to come up. The only tweets I hand-write are the ones where I ask for money.

Where else can I follow @thinkpiecebot?
I recently set up a Tumblr for it; it cross-posts tweets from Twitter over there too.

Why did @thinkpiecebot just tweet a bunch of times in a row?
I sometimes do tweet-bursts when I add new content. It’ll stop in a minute.

Will you add ________ to @thinkpiecebot?
Maybe; I do take suggestions that are tweeted to @NoraReed. However, there are a lot of places I don’t want @thinkpiecebot to go because they end up way too close to just parroting the people the bot is meant to make fun of. I’ve taken things out that make jokes that are too close to punching down and/or being “too real” before– namely “AIDS”– because they just felt like what happens when you play Cards Against Humanity or MadLibs with assholes.

What other work do you do?
I run both my personal blog at barrl.net and What Is GamerGate Currently Ruining; I also tweet as @NoraReed and have a bunch of other Twitter bots. (Here’s a full list of my essays, games and other projects.)

Do you take interviews?
Usually yes! If you aren’t paying me– which is fine– I’ll want you to include links to ways your readers can do so, because I’m an artist, and I need money for burritos, which I metabolize into more bots.