Late last week the instance of Jenkins we are running at the office on a Mac Mini started to reject incoming connections from all other devices on our LAN.
http://localhost:8080 still resolved fine and let us open the Jenkins web interface, while all requests from other machines by hostname (or to http://[ip]:8080 for that matter) would give a Connection Refused error. In the browser the web interface just would not load, without giving a proper error message. Only a request through wget unearthed the Connection Refused error as the cause.
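Since the browser hid the actual error, a small command-line probe makes the failure easier to see. The following is a minimal sketch, not what we ran at the time: it uses curl instead of wget (curl ships with OS X), and the function names `explain_curl_status` and `probe` are made up for illustration. It relies on curl's documented exit codes, where 7 means the connection could not be established and 28 means the operation timed out.

```shell
#!/bin/sh
# Hypothetical helper: translate a curl exit code into a short verdict.
explain_curl_status() {
  case "$1" in
    0)  echo "reachable" ;;
    6)  echo "could not resolve host" ;;
    7)  echo "connection failed (refused or host unreachable)" ;;
    28) echo "timed out" ;;
    *)  echo "curl exit code $1" ;;
  esac
}

# Hypothetical helper: request a URL, discard the body, report the verdict.
probe() {
  status=0
  curl --silent --output /dev/null --max-time 5 "$1" || status=$?
  explain_curl_status "$status"
}
```

Calling `probe http://localhost:8080` on the Mac Mini itself and then `probe` with the machine's LAN address from another device should show the difference between the two cases directly.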
This happened out of the blue and was triggered neither by an update of Jenkins itself nor of the OS. Nor had anything changed in the network infrastructure. Weirdly enough, once we had run into the error on our dedicated build server, we ran trials on all our other machines: it happened on every machine running OS X with a Jenkins 1.5xx revision installed (I did not test earlier revisions), even with Jenkins installed freshly on a vanilla system.
After days of investigating all possible software causes (we restored a Time Machine backup from back when the system was still working) and hardware causes (i.e. anything related to the network infrastructure), we just gave up and moved on to a workaround.
When you install Jenkins on OS X through Homebrew, it installs a LaunchAgent and asks you to use it to launch Jenkins with launchctl, OS X's built-in way to run and start programs and scripts during various phases of the system boot and login sequence.
We had used this way of starting Jenkins on our dedicated Mac Mini for a little over a year without issue. Until last week, that is, when the ominous Connection Refused error started.
Research on the web finally turned up this lead. Pretty desperate at the time, I went with it, even though the OS and the way of launching Jenkins in this poor guy's case were completely different from ours. At least the error he was getting was the same as ours.
And, in fact, it turned out that Jenkins would respond fine to requests coming from LAN devices when launched directly using the Java interpreter. After a couple more hours of trying to figure out why this was, going down the rabbit hole of network debugging and tracing tools and the weirdness of the OS X Firewall service, I finally decided to hack together a workaround.
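For anyone retracing this kind of network debugging, one thing that may be worth ruling out is whether the process is listening on all interfaces or only on the loopback address, since a loopback-only listener would produce exactly this split between localhost and LAN requests. I cannot say whether that was the cause here. The sketch below is only an illustration: `listen_scope` and `show_listeners` are hypothetical names, and it assumes `lsof` output where the listening address is the second-to-last column.

```shell
#!/bin/sh
# Hypothetical helper: classify a listening address such as "*:8080".
listen_scope() {
  case "$1" in
    127.*|"[::1]"*|localhost*) echo "loopback only" ;;
    "*"*|0.0.0.0*|"[::]"*)     echo "all interfaces" ;;
    *)                         echo "specific address" ;;
  esac
}

# Hypothetical helper: list every listener on a TCP port with its scope.
# Assumes lsof prints NAME as "<address> (LISTEN)" in the last two columns.
show_listeners() {
  lsof -nP -iTCP:"$1" -sTCP:LISTEN |
    awk 'NR > 1 { print $(NF - 1) }' |
    while read -r addr; do
      echo "$addr: $(listen_scope "$addr")"
    done
}
```

Running `show_listeners 8080` once with Jenkins started through launchctl and once with Jenkins started directly from Java would show whether the two launch methods bind differently.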
Using nohup, this line turned out to run Jenkins reliably, without the Out of Memory exceptions I experienced on the first couple of tries:
nohup java -XX:PermSize=512M -XX:MaxPermSize=2048M -Xmn128M -Xms1024M -Xmx2048M -jar /usr/local/opt/jenkins/libexec/jenkins.war
I put together a wrapper script, jenkins.sh, to manage Jenkins as a kind of service.
This allowed me to start and stop Jenkins while keeping track of the process and the output it generates.
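A wrapper of this kind could look roughly like the sketch below. It is a reconstruction for illustration, not the original script: the PID and log file locations, the environment variable names and the `jenkins_*` function names are all assumptions, and only the java command line is taken from above.

```shell
#!/bin/sh
# jenkins.sh: hypothetical start/stop wrapper (illustrative sketch only).
# Every path and variable name below is an assumption; override via environment.
JENKINS_WAR="${JENKINS_WAR:-/usr/local/opt/jenkins/libexec/jenkins.war}"
JENKINS_CMD="${JENKINS_CMD:-java -XX:PermSize=512M -XX:MaxPermSize=2048M -Xmn128M -Xms1024M -Xmx2048M -jar $JENKINS_WAR}"
PID_FILE="${PID_FILE:-$HOME/.jenkins/jenkins.pid}"
LOG_FILE="${LOG_FILE:-$HOME/.jenkins/jenkins.log}"

# True if the PID file exists and the recorded process is still alive.
jenkins_is_running() {
  [ -f "$PID_FILE" ] && kill -0 "$(cat "$PID_FILE")" 2>/dev/null
}

jenkins_start() {
  if jenkins_is_running; then
    echo "Jenkins is already running (pid $(cat "$PID_FILE"))"
    return 0
  fi
  mkdir -p "$(dirname "$PID_FILE")" "$(dirname "$LOG_FILE")"
  # $JENKINS_CMD is left unquoted on purpose so the flags split into arguments.
  nohup $JENKINS_CMD >>"$LOG_FILE" 2>&1 &
  echo "$!" >"$PID_FILE"
  echo "Jenkins started (pid $!)"
}

jenkins_stop() {
  if ! jenkins_is_running; then
    echo "Jenkins is not running"
    rm -f "$PID_FILE"
    return 0
  fi
  kill "$(cat "$PID_FILE")" && rm -f "$PID_FILE"
  echo "Jenkins stopped"
}

jenkins_status() {
  if jenkins_is_running; then
    echo "running (pid $(cat "$PID_FILE"))"
  else
    echo "stopped"
  fi
}

# Dispatch only when called with an argument, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
  case "$1" in
    start)  jenkins_start ;;
    stop)   jenkins_stop ;;
    status) jenkins_status ;;
    *)      echo "usage: $0 {start|stop|status}" >&2; exit 1 ;;
  esac
fi
```

Appending the output to a log file rather than letting nohup write nohup.out keeps everything Jenkins prints in one known place, and the PID file is what lets a later `stop` find the right process.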
The last step was to make OS X run jenkins.sh start when booting up. This brought me round to launchctl again and, sure enough, things fell apart again at that point. With Jenkins run through a LaunchAgent using my own wrapper, the same Connection Refused error blocked outside connection attempts to Jenkins' web interface.
I finally put the call to the script that starts Jenkins into an Automator-generated application, which I installed and set up as a Login Item for the user in OS X's System Preferences.
This finally did the trick, and our Jenkins-powered CI and deployment solution was up and running again. Half of my work week well wasted, and I still have no inkling what the root cause might be.