Solr monitoring on chkservd
Hi,
I've made this script to check the Solr service on my server:
#!/bin/bash
# Exit 0 if something is listening on port 8983, otherwise exit 1.
if netstat -tuln | grep -q ':8983'; then
    exit 0
else
    exit 1
fi
The script works when I run it by hand and exits 0, but chkservd, with this config:
service[solr]=/usr/local/bin/check_solr.sh,/usr/local/bin/restart_solr.sh,,/usr/local/bin/restart_solr.sh,solr,1,0,0,0,1
Fails to check it:
solr [[check command:-][socket connect:N/A][fail count:4]Restarting solr....
It then restarts Solr (the restart itself works fine). What can I do to solve this?
Thanks a lot.
-
Hey there! Is there a reason that the standard monitoring tool in WHM >> Service Manager isn't working for you? Once Solr is installed you'll see cpanel-dovecot-solr in the list of things that can be monitored by Service Manager, and that would be included in cPanel without any additional customizations needed.
0 -
Hi!
We run Solr standalone because we use it for an e-commerce listing, so unfortunately the dovecot-solr service isn't useful for us :(
0 -
Thanks for the additional details. I can't say for sure why that custom script wouldn't be working for Solr installed outside of cPanel, unfortunately, as that isn't something we support.
0 -
Hi,
I understand that, but how can I make chkservd recognize the process as running? If "exit 0" isn't recognized by chkservd, what output should the script produce?
0 -
There are several different operating codes that could be used - have you checked out the guide here for examples of the various calls?
0 -
Hi,
Yes, I've tried with:
service[solr]=x,x,x,/usr/local/bin/restart_solr.sh,solr,root
But I get this on the chkservd log:
solr [[check command:-][socket connect:N/A][fail count:2]Restarting solr....
I don't know why it isn't detected, because the solr process is running (as the solr user). The restart works fine; the problem is detecting that the solr process is alive (I've double-checked the name with ps aux).
0 -
What is the full output from "ps aux | grep -i solr" on that machine?
0 -
Hi!
It's this:
solr 2479045 4.4 2.7 22608800 7339852 ? Sl Dec18 63:23 java -server -Xms6144m -Xmx6144m -XX:+UseG1GC -XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250 -XX:+UseLargePages -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent -Xlog:gc*:file=/opt/solr/server/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M -Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes= -DzkClientTimeout=30000 -DzkRun -Dsolr.log.dir=/opt/solr/server/logs -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC -XX:-OmitStackTraceInFastThrow -XX:OnOutOfMemoryError=/opt/solr/bin/oom_solr.sh 8983 /opt/solr/server/logs -Djetty.home=/opt/solr/server -Dsolr.solr.home=/opt/solr/server/solr -Dsolr.data.home= -Dsolr.install.dir=/opt/solr -Dsolr.default.confdir=/opt/solr/server/solr/configsets/_default/conf -Xss256k -Dsolr.httpclient.builder.factory=org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory -Dsolr.httpclient.config=/opt/solr/server/solr/basicAuth.conf -Dsolr.log.muteconsole -jar start.jar --module=http --module=gzip
Thank you,
0 -
Thanks for that - that's exactly what I needed.
Since "solr" is the username and not the name of the process itself, chkservd doesn't see a "solr" process and triggers the restart.
Is there a certain port the service uses that you could monitor instead?
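To illustrate the difference between the user column and the process name in that ps output, here is a minimal sketch using pgrep. The function names are hypothetical, and this only shows how a name-based lookup behaves from a shell; how chkservd itself performs the match is not established in this thread.

```shell
#!/bin/bash
# Sketch: "a process named X" is not the same as "a process owned by user X".

# Succeeds if some process has exactly this executable name (the name of the
# binary, e.g. "java"), regardless of which user owns it.
has_process_named() {
    pgrep -x "$1" >/dev/null
}

# Succeeds if some process owned by user $1 has pattern $2 anywhere in its
# full command line (pgrep -f matches the whole command line).
has_process_for_user() {
    pgrep -u "$1" -f "$2" >/dev/null
}
```

With the ps output above, `has_process_named solr` would be expected to fail because the executable is java, while `has_process_named java` or `has_process_for_user solr start.jar` should succeed.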
0 -
Thanks to you!
Could we monitor "java -server", for example? It's the only process on this server with that name.
Yes, the process uses port 8983.
Thank you,
0 -
Sure - if this is the only java-server process that would work, or you could check and see that something is running on 8983 if the service is always listening there, like how Apache and Exim do.
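As a rough sketch of the "something is listening on 8983" idea, the function below tries an actual TCP connection instead of parsing netstat output. It relies on bash's /dev/tcp feature and the coreutils timeout command; the function name is hypothetical, and whether chkservd accepts a script like this as a check is not confirmed here.

```shell
#!/bin/bash
# Sketch: succeed only if something accepts TCP connections on host:port.
port_is_open() {
    local host=$1 port=$2
    # /dev/tcp/HOST/PORT is handled by bash itself; the child shell opens the
    # connection and closes it again when it exits. timeout guards against a
    # connect that hangs.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

A check script could then end with `port_is_open 127.0.0.1 8983` followed by `exit $?`, so that the exit status reflects whether the connection succeeded.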
0 -
Hi,
I've changed the service to java, like this:
service[java]=x,x,x,/usr/local/bin/restart_solr.sh,solr,root
However, chkservd still reports it as down :(
I've also tried with the port, like this:
service[solr]=8983,GET /solr/admin/info/system HTTP/1.0\r\nHost: localhost,,/usr/local/bin/restart_solr.sh,solr,root
and this way:
echo -e "GET /solr/admin/info/system HTTP/1.0\r\nHost: localhost\r\nAuthorization: Basic XXX\r\n\r\n" | nc localhost 8983
However, chkservd reports it as down both ways.
I've also checked the response from the terminal, and it's 200 OK:
echo -e "GET /solr/admin/info/system HTTP/1.0\r\nHost: localhost\r\nAuthorization: Basic XXX\r\n\r\n" | nc localhost 8983
Thanks again,
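For reference, the manual check above prints the whole response, whereas a monitor usually needs only a pass/fail result. The sketch below swaps nc for curl to get the status code and turns it into an exit status. SOLR_USER and SOLR_PASS are hypothetical variables standing in for the Basic auth credentials, and the function names are made up for illustration; this does not explain why chkservd reports the service as down.

```shell
#!/bin/bash
# Sketch: reduce the HTTP response to a success/failure exit status.

# Succeeds only for status code 200.
is_http_ok() {
    [ "$1" = "200" ]
}

# Print the HTTP status code of the Solr system-info endpoint.
# curl prints 000 when it cannot connect at all.
# SOLR_USER / SOLR_PASS are hypothetical placeholders for the credentials.
solr_http_code() {
    curl -s -o /dev/null --max-time 5 -w '%{http_code}' \
        -u "${SOLR_USER:-}:${SOLR_PASS:-}" \
        "http://localhost:8983/solr/admin/info/system"
}

# Succeeds only when Solr answers the request with 200.
solr_is_up() {
    is_http_ok "$(solr_http_code)"
}
```

A check script built on this would call `solr_is_up` and exit with its status.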
0 -
I'm honestly not sure why none of those are working properly, as that first "java" one you tried should work as expected.
You're always welcome to create a ticket so this can be examined directly on the affected system.