This post outlines some troubleshooting steps to take when OpenNMS refuses to start. In this scenario, the service status looked like the following before the service eventually stopped:
OpenNMS.Eventd         : start_pending
OpenNMS.Trapd          : start_pending
OpenNMS.Queued         : start_pending
OpenNMS.Actiond        : start_pending
OpenNMS.Notifd         : start_pending
OpenNMS.Scriptd        : start_pending
OpenNMS.Rtcd           : start_pending
OpenNMS.Pollerd        : start_pending
OpenNMS.PollerBackEnd  : start_pending
OpenNMS.Ticketer       : start_pending
OpenNMS.Collectd       : start_pending
OpenNMS.Discovery      : start_pending
OpenNMS.Vacuumd        : start_pending
OpenNMS.EventTranslator: start_pending
OpenNMS.PassiveStatusd : start_pending
OpenNMS.Statsd         : start_pending
OpenNMS.Provisiond     : start_pending
OpenNMS.Reportd        : start_pending
OpenNMS.Alarmd         : start_pending
OpenNMS.Ackd           : start_pending
OpenNMS.JettyServer    : start_pending
opennms is partially running
[00:11:56]-> service opennms status
Could not connect to 127.0.0.1 on port 8181 (OpenNMS might not be running or could be starting up or shutting down): Connection refused
opennms is stopped
Meanwhile, the web server returned the following error page:
Service Temporarily Unavailable
The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.
A key step is to analyse the daemon logs, which can be found in /var/log/opennms/daemon/. Run tail -f /var/log/opennms/daemon/* in one terminal to follow them in real time while you start the service in another.
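When the logs are noisy, it can help to pull out only the earliest ERROR lines, since later errors are often knock-on effects. The following is a minimal sketch, assuming every log line starts with a timestamp followed by the level, as in the excerpts in this post. The function name first_errors is made up for illustration and is not part of OpenNMS.

```shell
#!/bin/sh
# first_errors DIR [COUNT]: print the earliest COUNT (default 5) ERROR lines
# found across the log files in DIR, ordered by their leading timestamp.
# Assumes lines look like "2014-06-15 00:19:29,172 ERROR [Main] ...".
first_errors() {
    dir=${1:-/var/log/opennms/daemon}
    count=${2:-5}
    # -h drops the file name prefix so that sort sees the timestamp first
    grep -h ' ERROR ' "$dir"/* 2>/dev/null | sort | head -n "$count"
}
```

For example, first_errors /var/log/opennms/daemon 3 prints the three earliest ERROR lines across all daemon logs.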
You may see exceptions like the following, but they are often only a symptom of another problem, i.e. OpenNMS is unable to call stop on a daemon because the service never started:
2014-06-15 00:19:29,166 DEBUG [Main] Invoker: Invoking stop on object OpenNMS:Name=Vacuumd
2014-06-15 00:19:29,172 ERROR [Main] Invoker: An error occurred invoking operation stop on MBean OpenNMS:Name=Vacuumd: javax.management.RuntimeMBeanException: java.lang.NullPointerException
javax.management.RuntimeMBeanException: java.lang.NullPointerException

2014-06-15 00:19:29,231 DEBUG [Main] Manager: Thread dump completed.
2014-06-15 00:19:29,232 DEBUG [Main] Manager: memory usage (free/used/total/max allowed): 47401760/116815072/164216832/1200160768
2014-06-15 00:19:29,232 INFO [Main] Manager: calling System.exit(1)

An error occurred while attempting to start the "OpenNMS:Name=Notifd" service (class org.opennms.netmgt.notifd.jmx.Notifd). Shutting down and exiting.
javax.management.RuntimeMBeanException: java.lang.reflect.UndeclaredThrowableException
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrow(DefaultMBeanServerInterceptor.java:839)
However, this points to a startup problem, and a common cause is a broken config file. A good first step is to look closely at any files that you or others have modified. Alternatively, you can check the XML config files for syntax errors using xmllint:
xmllint --noout /etc/opennms/*xml
In my case, this showed that a "<" was missing at the beginning of a file I had been working on; I had somehow accidentally removed the character.
notifd-configuration.xml:1: parser error : Start tag expected, '<' not found
?xml version="1.0" encoding="UTF-8"?>
^
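If there are many config files, a short summary of which ones fail can be easier to read than the raw parser output. Here is a rough sketch that runs xmllint --noout on each file in turn and lists the ones that do not parse. The function name check_xml and the "BROKEN:" prefix are my own choices for illustration, and the sketch matches *.xml rather than *xml.

```shell
#!/bin/sh
# check_xml [DIR]: run xmllint on every *.xml file in DIR (default
# /etc/opennms) and list the files that fail to parse.
# Returns non-zero if at least one file is malformed.
check_xml() {
    dir=${1:-/etc/opennms}
    bad=0
    for f in "$dir"/*.xml; do
        [ -e "$f" ] || continue    # glob matched nothing, skip the literal pattern
        # Parser messages are hidden here; only the exit status is used
        if ! xmllint --noout "$f" 2>/dev/null; then
            echo "BROKEN: $f"
            bad=1
        fi
    done
    return "$bad"
}
```

Once check_xml has named a broken file, running xmllint --noout on that file alone shows the full parser error.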
Fix the file and start the OpenNMS service again. It can take a while to start, so keep checking the status to confirm that the service stays up until it settles into a stable running state.
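Rather than re-running the status command by hand, the check can be put in a small retry loop. This is only a sketch: wait_until_ok is a hypothetical helper, and using it with service opennms status assumes that the status command exits with 0 only once OpenNMS is fully running, which is worth verifying on your own install.

```shell
#!/bin/sh
# wait_until_ok TRIES DELAY CMD...: run CMD every DELAY seconds until it
# exits 0 or TRIES attempts have been used up.
# Returns 0 on the first success, 1 if CMD never succeeded.
wait_until_ok() {
    tries=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, wait_until_ok 30 10 service opennms status && echo "OpenNMS is up" would check every 10 seconds for up to 30 attempts.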