[Sisuite-devel] systemimager.org down
Andrea Righi
2009-01-12 08:51:23 UTC
Greg,

systemimager.ci.uchicago.edu seems to be down: it responds to ping and
telnet, but nothing else.

When you have a minute could you try to reset the server?

Many thanks for your time,
-Andrea
Brian Elliott Finley
2009-01-12 19:08:02 UTC
Thanks, Ti,

Will do.

-Brian
future instead of Greg or I individually. Thanks.
--
Brian Elliott Finley
CIS / Argonne National Laboratory
Office: 630.252.4742
Mobile: 630.631.6621
Andrea Righi
2009-01-12 20:22:51 UTC
Thanks Ti,

everything's working fine now. We'll write to the support list next time.

For the other admins/developers (Brian, Bernard, ..): in addition to
the check-oom.pl script I've configured the kernel with:
kernel.panic = 60
vm.panic_on_oom = 2

In case of future OOMs (not prevented by the script), the system will be
forced to panic and will reboot after 60 sec. Hopefully this will
finally prevent the hangs due to OOM.
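
For anyone who wants to double-check the two settings above on another box, here is a minimal sketch in Python. It only parses sysctl-style "key = value" text and reports what does not match the values given in this mail. The function names (parse_sysctl, oom_panic_problems) are made up for illustration and are not part of check-oom.pl or SystemImager, and the mail does not say where the settings were stored, so the text is passed in as a plain string.

```python
def parse_sysctl(text):
    """Parse sysctl-style 'key = value' lines into a dict of strings."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments ('#' or ';' start a comment).
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a 'key = value' line
        settings[key.strip()] = value.strip()
    return settings


def oom_panic_problems(settings):
    """Return a list of problems; an empty list means both settings are in place."""
    problems = []
    # kernel.panic = 0 means the kernel does not reboot by itself after a panic.
    if int(settings.get("kernel.panic", "0")) == 0:
        problems.append("kernel.panic is 0: no automatic reboot after a panic")
    # vm.panic_on_oom = 2 is the mode that panics on every OOM.
    if settings.get("vm.panic_on_oom", "0") != "2":
        problems.append("vm.panic_on_oom is not 2: an OOM is not guaranteed to panic")
    return problems


# The values quoted in this mail.
EXAMPLE = """\
# panic and reboot on OOM
kernel.panic = 60
vm.panic_on_oom = 2
"""

if __name__ == "__main__":
    for problem in oom_panic_problems(parse_sysctl(EXAMPLE)):
        print(problem)
```

With the values from this mail the list of problems is empty, so the script prints nothing. With an empty configuration it reports both settings as missing.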

-Andrea
Brian Elliott Finley
2009-01-12 21:30:48 UTC
Excellent idea, Andrea.

Thanks, -Brian