I was tinkering with some Nginx host proxy configurations and discovered a juvenile but neat trick for decloaking chat bots that prefetch URLs posted in chat channels.

Internet Relay Chat (IRC) is an Internet protocol used by millions of diverse users worldwide (developers, businesses, private communities, etc.). IRC bots are generally scripts designed to reside in one or more IRC channels and carry out various automated tasks as controlled by their bot master. These tasks can be anything from preventing a hostile takeover of the channel and automatically kicking users who flood the channel with spam or other content, to telling jokes, providing local weather forecasts to channel users, and so forth. The bot resides on the chat server, appearing just like any other user in the channel.

For the most part, they are an essential asset to the IRC channel users and/or operators that run them, and generally are harmless.

A common feature of these bots is a URL fetching mechanism. When a user posts a link in the channel, the bot visits the web page on its own and returns the page title to the channel, so that the channel users can get an idea of the content behind the link.
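To make the mechanism concrete, here is a minimal Python sketch of what such a title fetcher might look like. It is not taken from any particular bot; the names `TitleParser`, `extract_title`, and `fetch_title`, and the `ExampleBot/0.1` user agent string, are all hypothetical. The point to notice is that the HTTP request leaves from the bot's own machine and carries its own User-Agent header.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        # Collapse runs of whitespace so the title fits on one chat line.
        return " ".join("".join(self._parts).split())


def extract_title(html):
    """Return the page title found in an HTML string ('' if none)."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title


def fetch_title(url, user_agent="ExampleBot/0.1"):
    """Fetch a URL the way a bot might and return its title.

    The web server on the other end sees the bot's real IP address
    and the User-Agent header sent here (a hypothetical value).
    """
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        html = response.read(65536).decode("utf-8", errors="replace")
    return extract_title(html)
```

Whatever `fetch_title` returns is what the bot would announce in the channel, which is what makes the trick described in this post possible.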

It is common practice to cloak one’s IP address/hostname for reasons of security and anonymity on IRC. The bot has its own IRC server account, and its program runs on a computer or server just like everyone else’s, so it relies on the same obscurity for protection. The problem for the bot (and the bot master) is that there are methods to decloak the IP address and the bot software/version, thus compromising the anonymity and potentially opening up an attack vector on the system where the bot resides. Shown below is an example of how this is done.


Nginx URL Rewrites

Nginx is a lightweight, easy-to-configure web server and reverse proxy. Using the $http_user_agent variable in the Nginx configuration, we can write a regular expression and perform a URL rewrite whenever that expression matches the bot’s user agent.

For this example, I am simply using the term “bot” as my regex. The configuration block matches any user agent that contains the word “bot” (use caution: this matches every user agent with “bot” anywhere in the string, not just IRC bots) and rewrites the URL to http://website.com/bot.php.
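A minimal sketch of such a configuration block might look like the following. Everything beyond the “bot” regex and the target URL is an assumption made for illustration: the document root, the PHP-FPM address, and the choice to serve bot.php from its own location so that the redirected request is not rewritten again.

```nginx
server {
    listen 80;
    server_name website.com;
    root /var/www/html;  # assumed document root

    location / {
        # ~* is a case-insensitive regex match against the User-Agent header.
        # "bot" is deliberately broad and also matches crawlers such as Googlebot.
        if ($http_user_agent ~* "bot") {
            # A replacement starting with http:// makes Nginx return a redirect.
            rewrite ^ http://website.com/bot.php;
        }
    }

    # Exact-match location, so requests for bot.php skip the rewrite above.
    location = /bot.php {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass 127.0.0.1:9000;  # assumed PHP-FPM address
    }
}
```

Any client whose user agent does not contain “bot” is served as usual, so ordinary visitors never see the redirect.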

Using PHP to Get The IP Address & User Agent

The next step is to create the page bot.php with the following line of code:

There are obviously several ways to do this in several different languages, but ultimately the page decloaks the bot’s IP address and reveals its software and version in the user agent. This was tested against Supybot in an IRC channel.
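As an illustration of one of those other languages, here is a minimal Python sketch of a page equivalent to bot.php, written as a WSGI application. It is not the PHP from this post. The name `reveal_page` is hypothetical, and placing the values in the page title is my assumption, chosen because the title is what a URL-fetching bot reports back to the channel.

```python
from html import escape


def reveal_page(environ, start_response):
    """WSGI app that puts the client's IP address and User-Agent in the title."""
    ip = environ.get("REMOTE_ADDR", "unknown")
    agent = environ.get("HTTP_USER_AGENT", "unknown")

    # Escape both values: the User-Agent header is client-controlled text.
    title = escape(f"{ip} {agent}")
    body = f"<html><head><title>{title}</title></head><body></body></html>"
    payload = body.encode("utf-8")

    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

For a quick local test, the app can be served with `make_server` from the standard library's `wsgiref.simple_server`. If it runs behind a reverse proxy, `REMOTE_ADDR` will be the proxy's address, so the proxy would need to pass the client address along in a header.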

In the end, if you run a channel bot and do not want its IP address and user agent to be decloaked, disable any URL fetching mechanisms; it’s probably in your best interests *not* to allow your bot to fetch whatever random webpage someone pastes in your public channel (e.g. XSS, illegal content, etc.).
