A Gotcha Using Node.js + Request In a Daemon
13 April 2012
I have a Node.js program running as a daemon on a Linux VPS. Periodically, it polls a list of URLs using request. When it first starts, everything runs smoothly. But after running for a while, it starts getting 400 errors, and the longer it runs, the more URLs return 400 errors.
I could not understand what was going on. My code was basically structured like this:
Given that code, we know the req object is initialized with each function call. So, how could this script degrade over time?
Well, I finally tracked it down: COOKIES!
Yup, request has cookies enabled by default. So, I think what was happening was that cookies were being set (presumably, top domain-level cookies having the same name at different URLs or subdomains on the same domain) but the values in request's cookie jar were not being returned properly. That means the remote host was getting invalid cookies -- hence the 400 response for a "Bad Request."
I haven't yet spent the time to figure out if this is a bug in request. It's on my TODO list.
In the meantime, I've disabled cookies in the req object:
var req = { url: url, timeout: options.timeout, jar: false };
It's now working as expected.
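To make the failure mode concrete, here is a toy model of it, written in Python for illustration only. It is not request's implementation: `fetch` and `DEFAULT_JAR` are made-up names, and the only assumption is that the library keeps one default cookie jar for the life of the process. It shows how state can leak between calls even though every call builds a fresh options object, and why opting out of the jar isolates each call.

```python
# Toy model only: `fetch` and `DEFAULT_JAR` are hypothetical names,
# not part of request's API.
DEFAULT_JAR = {}  # lives as long as the process, like a library-wide cookie jar


def fetch(options, response_cookies=None):
    """Pretend to make a request; return the cookies that would be sent."""
    jar = options.get("jar", DEFAULT_JAR)
    if jar is False:
        return {}                       # cookies disabled for this call
    sent = dict(jar)                    # snapshot of what goes out with this request
    jar.update(response_cookies or {})  # remember whatever the server set
    return sent


# Each call builds a brand-new options dict, yet state still leaks between calls.
first = fetch({"url": "http://a.example.com/"}, {"session": "abc"})
second = fetch({"url": "http://b.example.com/"})
isolated = fetch({"url": "http://b.example.com/", "jar": False})
print(first, second, isolated)
```

In a long-running daemon the shared jar only grows, which would fit the pattern of more and more URLs failing the longer the process runs.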
Notes On Creating A Multi-user Feed Aggregator
6 April 2012
Some time ago, I answered another user's question on Stack Overflow about database design for a multi-user feed aggregator. I also received an email from a developer asking for additional input, which I shared. But I thought I should put my response here, as well, for posterity's sake if nothing else. Note that my comments here assume MySQL as the database but should apply to any SQL database.
Basically, my emailer asked what to do about the fact that the posts table will get huge very quickly if we have multiple users and a row for each post for each user. It's actually a pretty basic relational database scenario, but if you start your project as a single user application and later decide it's going to be multi-user, you may not realize that you probably need to completely redesign your database.
So I've posted the schema I use for my multi-user feed aggregator (a private project):
I've also posted a SQL command that you can run in a cron job to remove read posts that are more than 14 days old:
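For readers who want something to experiment with, here is a minimal sketch of the general idea, using Python's built-in sqlite3 so that it runs anywhere. It is not the schema or the cron command referred to above: every table and column name is an assumption, and so is the cleanup rule (a post is removed once it is older than 14 days and every subscriber of its feed has read it). The point is that each post is stored once and the per-user state lives in a narrow join table.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical schema: all table and column names are illustrative only.
SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE feeds (id INTEGER PRIMARY KEY, url TEXT NOT NULL UNIQUE);
CREATE TABLE posts (                 -- one row per post, shared by all users
    id INTEGER PRIMARY KEY,
    feed_id INTEGER NOT NULL REFERENCES feeds(id),
    title TEXT,
    published TEXT NOT NULL          -- ISO 8601 timestamp
);
CREATE TABLE subscriptions (         -- who follows which feed
    user_id INTEGER NOT NULL REFERENCES users(id),
    feed_id INTEGER NOT NULL REFERENCES feeds(id),
    PRIMARY KEY (user_id, feed_id)
);
CREATE TABLE read_posts (            -- one small row per (user, post) read
    user_id INTEGER NOT NULL REFERENCES users(id),
    post_id INTEGER NOT NULL REFERENCES posts(id),
    read_at TEXT NOT NULL,
    PRIMARY KEY (user_id, post_id)
);
"""

PURGE = """
DELETE FROM posts
WHERE published < :cutoff
  AND NOT EXISTS (                   -- no subscriber is left who has not read it
      SELECT 1 FROM subscriptions s
      WHERE s.feed_id = posts.feed_id
        AND NOT EXISTS (
            SELECT 1 FROM read_posts r
            WHERE r.post_id = posts.id AND r.user_id = s.user_id))
"""


def purge_read_posts(conn, now, days=14):
    """Delete old, fully read posts; return how many posts were removed."""
    cutoff = (now - timedelta(days=days)).isoformat()
    deleted = conn.execute(PURGE, {"cutoff": cutoff}).rowcount
    # Drop read markers that now point at deleted posts.
    conn.execute("DELETE FROM read_posts WHERE post_id NOT IN (SELECT id FROM posts)")
    conn.commit()
    return deleted


def demo():
    """Build a tiny in-memory database and run one purge."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)",
                     [(1, "ann"), (2, "bob")])
    conn.execute("INSERT INTO feeds (id, url) VALUES (1, 'http://example.com/feed')")
    conn.executemany("INSERT INTO subscriptions VALUES (?, 1)", [(1,), (2,)])
    conn.executemany(
        "INSERT INTO posts (id, feed_id, title, published) VALUES (?, 1, ?, ?)",
        [(1, "old, read by everyone", "2012-03-01T00:00:00"),
         (2, "old, read by one user", "2012-03-01T00:00:00"),
         (3, "recent, read by everyone", "2012-04-05T00:00:00")])
    conn.executemany(
        "INSERT INTO read_posts VALUES (?, ?, '2012-04-05T00:00:00')",
        [(1, 1), (2, 1), (1, 2), (1, 3), (2, 3)])
    deleted = purge_read_posts(conn, now=datetime(2012, 4, 6))
    remaining = [row[0] for row in conn.execute("SELECT id FROM posts ORDER BY id")]
    return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

With this layout the posts table grows with the number of feeds rather than the number of users, and only the narrow read_posts table scales with users times posts.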
As an aside, I'm amazed by how many people are writing new aggregators. Is it a common programming class exercise to write a feed aggregator or something?
Fun with River2
11 February 2011
I decided to install Dave Winer's River2 to supplement my usual feed reading. Now that I can access it via its smart use of Dropbox, it should be good for feeds where I don't feel I need to see every headline.
One of the things I love about River2 is that it's an app that runs in the OPML Editor, which means that it is endlessly hackable and (apropos to this post) you can fix your own bugs.
So here's a bug report. And fix. (Actually, it could be a workaround for a bug in another application, as I explain below).
- What I was doing: From the Tools > River2 > Pages menu, I selected a page to view (any one, it's the same bug no matter which page).
- What I expected to happen: I expected the selected page to open in my default web browser, Pale Moon (a Windows-optimized build of Firefox).
- What actually happened: Nothing. Not even an error dialog.
I immediately suspected that the problem was the communication between the OPML Editor and the Pale Moon browser. After all, there was a major bug for the longest time in Firefox's DDE implementation that required a workaround.
Bottom line: the OPML Editor's DDE implementation expects that the DDE service name is the same as the name of the executable with the filename suffix removed. So, for Excel, the service name is "excel," and for Firefox it's "firefox." But the service name is determined by the application, and the Pale Moon developers decided that its service name would be "Pale Moon," not "palemoon." A simple patch to system.verbs.builtins.webBrowser.openURL resolves the problem.
if string.lower (id) contains "palemoon" { // 2/11/11; 12:09:06 AM by DJM
    ddeName = "Pale Moon";
    return (webBrowser.callBrowser (ddeName, "WWW_OpenURL", s+",,0,0,,,,"))}
The function webBrowser.callBrowser expects ddeName to be the name of the executable, from which it attempts to remove the ".exe" suffix. Luckily, if the function is passed any string without an ".exe" suffix, it just accepts the passed string as the DDE service name.
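The naming rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the OPML Editor's code: `dde_service_name` and `DDE_NAME_OVERRIDES` are made-up names, and the only override that comes from this post is the Pale Moon one.

```python
import ntpath  # Windows path rules, usable on any platform

# Hypothetical override table for executables whose DDE service name is not
# simply the executable's base name. Only the Pale Moon entry is from the post.
DDE_NAME_OVERRIDES = {"palemoon": "Pale Moon"}


def dde_service_name(executable):
    """Guess a browser's DDE service name from its executable path."""
    base = ntpath.basename(executable)
    # Default rule: drop a trailing ".exe"; any other string passes through.
    stem = base[:-4] if base.lower().endswith(".exe") else base
    return DDE_NAME_OVERRIDES.get(stem.lower(), stem)


print(dde_service_name(r"C:\Program Files\Mozilla Firefox\firefox.exe"))
print(dde_service_name(r"C:\Program Files\Pale Moon\palemoon.exe"))
```

A lookup table keeps the default rule intact and leaves room for other browsers that pick their own service names.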
Here's the full context:
That ",,0,0,,,," nonsense is part of the DDE message that Pale Moon expects:
Delete Empty Folders
17 September 2009
I recently found that I had a lot of empty folders in my MP3 folder after a wayward ripping session. So I whipped up this quick DOS one-liner to remove all empty folders.
From a command prompt, just change to the folder containing all the empty folders and enter the following:
FOR /f "tokens=*" %G IN ('dir /ad /b /s') DO rd /q "%G"
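The command "rd /q" is executed on every folder, but without the /s switch "rd" removes only empty folders and simply reports an error for non-empty ones. Because "dir" lists parent folders before their subfolders, a folder that contains nothing but empty folders survives the first pass; rerun the command until nothing more is removed.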
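The same cleanup can be sketched in Python, which also runs outside Windows. This is a minimal sketch rather than a drop-in replacement for the one-liner; the function name `remove_empty_dirs` is made up. Walking the tree bottom-up means nested empty folders disappear in a single pass.

```python
import os


def remove_empty_dirs(root):
    """Remove every empty directory below root; return the paths removed."""
    removed = []
    # topdown=False visits children before parents, so a folder that held
    # only empty folders is itself empty by the time it is visited.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue  # leave the starting folder itself alone
        try:
            os.rmdir(dirpath)  # like "rd": refuses to delete non-empty folders
        except OSError:
            continue
        removed.append(dirpath)
    return removed


if __name__ == "__main__":
    for path in remove_empty_dirs(os.getcwd()):
        print("removed", path)
```

Like "rd", os.rmdir never touches a folder that still has files in it, so the worst case is that nothing happens.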
XMPP vCard Python Script
3 June 2009
Couldn't find a script to update my Jabber/XMPP vCard photo (a/k/a avatar), so I wrote one. It requires xmpppy (a/k/a python-xmpp). It should work with gTalk, but I have not tested it.
Credit to pastebin for some code snippets.
Hope this saves someone some time and effort.
#!/usr/bin/python
'''vcard.py - Update your XMPP vcard photo with the image you provide
Usage: vcard.py image_file jid password
'''
from xmpp import JID, Client, Iq, Presence, NS_VERSION, NS_VCARD
import sys
import os
import time
from base64 import encode, decode
from hashlib import sha1

# Read the command-line arguments; print the usage text if any are missing
try:
    file=os.path.expanduser(sys.argv[1])
    jid=sys.argv[2]
    password=sys.argv[3]
    resource='vcard'
except IndexError:
    print >>sys.stderr, __doc__
    sys.exit(2)

NS_VCARD_UPDATE = 'vcard-temp:x:update'
NS_NICK = 'http://jabber.org/protocol/nick'

def hash_img(img):
    # SHA-1 of the raw image bytes, announced in the presence update
    return sha1(img).hexdigest()

def base64_img(img):
    return img.encode('base64')

def get_img(file):
    try:
        os.stat(file)[6]
        fh = open(file, 'rb')
        img = fh.read()
        fh.close()
        return img
    except Exception, e:
        print >>sys.stderr, e
        sys.exit(2)

def get_mime_type(file):
    try:
        # Compare the whole extension; a 4-character slice can never equal '.jpeg'
        ext = os.path.splitext(file)[1].lower()
        if ext == '.png':
            mime_type = 'image/png'
        elif ext == '.gif':
            mime_type = 'image/gif'
        elif ext == '.jpg' or ext == '.jpeg':
            mime_type = 'image/jpeg'
        else:
            raise ValueError, "Wrong mime-type detected. Check file suffix."
    except ValueError, e:
        print >>sys.stderr, e
        sys.exit(2)
    return mime_type

def send_vcard(conn, base64_img, mime_type, nick):
    # Build and send the vCard with the new photo
    iq_vcard = Iq(typ='set')
    vcard = iq_vcard.addChild(name='vCard', namespace=NS_VCARD)
    vcard.addChild(name='NICKNAME', payload=[nick])
    photo = vcard.addChild(name='PHOTO')
    photo.setTagData(tag='TYPE', val=mime_type)
    photo.setTagData(tag='BINVAL', val=base64_img)
    conn.send(iq_vcard)

def send_presence(conn, status, hash1, nick):
    # Announce the new photo hash so contacts know to refresh the avatar
    presence = Presence(status = status, show = 'xa', priority = '-1')
    presence.setTag(name='x',namespace=NS_VCARD_UPDATE).setTag(name='photo',namespace=NS_VCARD_UPDATE).setData(hash1)
    presence.setTag(name='nick',namespace=NS_NICK).setData(nick)
    conn.send(presence)

if __name__ == '__main__':
    img = get_img(file)
    j=JID(jid)
    cl=Client(j.getDomain(),debug=[])
    conn=cl.connect()
    if not conn:
        raise Exception, 'failed to start connection'
    auth=cl.auth(j.getNode(),password,resource,sasl=1)
    if not auth:
        raise Exception, 'could not authenticate'
    send_vcard(cl, base64_img(img), get_mime_type(file), j.getNode())
    send_presence(cl, 'Updated vCard Image', hash_img(img), j.getNode())
    time.sleep(1)
    cl.disconnect()
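The script above is Python 2 code. For anyone adapting it, here is a rough Python 3 sketch of just the three pure helper functions, using only the standard library. The `MIME_TYPES` table is my own addition for illustration, and the XMPP parts are left out because they depend on which xmpppy build is installed.

```python
import base64
import hashlib
import os

# Assumed extension-to-type table covering the formats the script accepts.
MIME_TYPES = {
    ".png": "image/png",
    ".gif": "image/gif",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
}


def hash_img(img: bytes) -> str:
    """SHA-1 hex digest of the raw image bytes."""
    return hashlib.sha1(img).hexdigest()


def base64_img(img: bytes) -> str:
    """Base64 text for the vCard photo; bytes no longer have .encode('base64')."""
    return base64.b64encode(img).decode("ascii")


def get_mime_type(path: str) -> str:
    """Map the file extension to a MIME type, ignoring case."""
    ext = os.path.splitext(path)[1].lower()
    try:
        return MIME_TYPES[ext]
    except KeyError:
        raise ValueError("Unsupported image type. Check file suffix: %r" % path)
```

Raising ValueError instead of exiting lets the caller decide how to report a bad file name.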
On Bootstrapping
13 April 2009
On Friday, Dave Winer released a terrific thought-piece-of-a-podcast on how journalists need to learn about bootstraps. In his most recent podcast with NYU's Jay Rosen, he and Jay discussed the topic, as well, but I want to focus on bootstrapping, the metaphor.
Dave offered the well-worn phrase "haul yourself up by your bootstraps" as the mental image we should have when we use the bootstrapping metaphor. Imagine that you're wearing your boots, you grab your bootstraps and pull on them. Well, the best outcome I can imagine is that you'd fail to accomplish anything. If you could accomplish anything, I think all you'd do is pull your feet out from under yourself. But the phrase is supposed to connote (I think) strength by self-determination and self-motivation. That's why MBA-types say they're going to "bootstrap" their start-ups when what they really mean is that their start-ups will be self-funded at the outset.
But I recall learning that before bootstraps became merely decorative, they actually served a useful purpose: namely, to strap your boots to the top of heavy items so you could carry them. Maybe it's apocryphal, but here's MY mental image of bootstrapping: a person on a horse, laden with a pack on its rump, and a heavy wooden storage box strapped to each of the rider's boots.
And that more closely matches what bootstrapping means to me: taking advantage of what's already up-and-running and using that existing momentum to get something else moving.
Character Encoding Help
17 November 2007
Tom Morris is pulling his hair out dealing with XML character encoding issues. I've gone through this myself. I found that the SimplePie feed parser has great logic for dealing with this, so I adapted it to my needs in my PHP class XMLParseIntoArray. I think I've expanded on SimplePie's approach a bit, but it's still a work in progress. YMMV. Hope this helps.
PHP command line mode detection
23 August 2007
Use a block like this in PHP code to detect whether or not it's running in command line mode as opposed to web server script mode.
// if $_ENV['SHELL'] exists, we're probably in command line mode
if (array_key_exists('SHELL', $_ENV)) {
    $this->setOutputMode(MYSQLICIOUS_OUTPUT_CMD);
} else {
    $this->setOutputMode(MYSQLICIOUS_OUTPUT_HTML);
}
Juice-y Python
21 June 2007
I use Juice to manage my podcasts. But it doesn't do everything I need, and it's a little buggy, and I want to learn Python anyway. So I decided to download the latest source code and see if I could fix some of the bugs I've noticed and figure out how to extend it to do everything I need.
So step one was just getting to a point where I could compile it. The source code documentation is incomplete, so here's what I did, starting from scratch.
- Install Python 2.5.1
- Install pywin32 (I'm on Windows)
- Install mfc71.dll (needed by pywin32)
- Install py2exe (needed to package the Python source as a Windows executable)
- Install wxPython (for the gui)
- Install pysqlite (may not be necessary, but I knew I'd need it eventually)
- Install NSIS (Nullsoft Scriptable Install System)
- Install NSIS FindProcDLL plug-in
PHEW!
After all of that, it was actually fairly easy to build and install. However, I made the mistake of trying to upgrade the Universal Feed Parser, only to find that although Juice would still compile, install and run, it was silently crapping out while trying to read feeds, so it would not actually update my podcasts. I reverted to the version of UFP bundled with Juice and everything was fine (except for the UFP bugs I was hoping to have solved by using a later version, of course).
I should update this as things progress.


