Discussion:
[Sisuite-devel] [Fwd: systemimager.ci.uchicago.edu down]
Andrea Righi
2009-02-08 13:53:37 UTC
Dear ***@uchicago,

Sorry to bug you again, but the server never came back up and it's
still down. Any news? Is there any chance of rebooting it manually?

Thanks,
-Andrea

-------- Original Message --------
Subject: systemimager.ci.uchicago.edu down
Date: Thu, 22 Jan 2009 09:54:13 +0100
From: Andrea Righi <***@gmail.com>
To: ***@ci.uchicago.edu
CC: Brian Elliott Finley <***@anl.gov>, Bernard Li <***@vanhpc.org>, sisuite-dev <sisuite-***@lists.sourceforge.net>

Dear ***@uchicago,

The host systemimager.ci.uchicago.edu seems to be down (it answers
neither ping nor telnet).

Over the past few days we've had a lot of out-of-memory (OOM) problems.
We have now configured the server to prevent OOM conditions (using a
script that restarts Apache when memory is getting low), and in case an
OOM can't be prevented the kernel automatically reboots after an OOM
trace. Unfortunately, this doesn't seem to be enough...
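
For illustration only, a low-memory watchdog of the kind described above could be sketched roughly as follows. This is not the script actually running on the server: the 64 MiB threshold, the function names, and the `apachectl restart` command are all assumptions. The reboot-after-OOM behaviour is usually obtained on Linux with the `vm.panic_on_oom` and `kernel.panic` sysctls, but whether this server was configured that way is likewise an assumption.

```python
import subprocess

# Hypothetical threshold: restart Apache when less than 64 MiB is available.
MIN_FREE_KB = 64 * 1024


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: integer value}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def free_kb(info):
    """Rough estimate of usable memory: free + buffers + page cache (kB)."""
    return info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)


def should_restart(info, min_free_kb=MIN_FREE_KB):
    """True when the estimated usable memory is below the threshold."""
    return free_kb(info) < min_free_kb


def check_once(meminfo_path="/proc/meminfo",
               restart_cmd=("apachectl", "restart")):
    """Read memory statistics once and restart Apache if memory is low."""
    with open(meminfo_path) as f:
        info = parse_meminfo(f.read())
    if should_restart(info):
        # Assumed restart command; adjust for the local Apache installation.
        subprocess.call(list(restart_cmd))
        return True
    return False
```

A script like this would typically be run periodically, for example from cron, by calling `check_once()`. It can only help while the machine is still responsive enough to run it, which is why a kernel-level reboot is kept as a fallback.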

Could you please check whether the server is down for some other reason
(not OOM; if there is a console, what message is on the screen?) and
try to reboot it manually?

Many thanks,
-Andrea
--
Andrea Righi,
PhD student
Department of Information Engineering
Universita' degli Studi di Siena
Via Roma, 56 - 53100 Siena (Italy)
Brian E. Finley
2009-02-08 19:24:41 UTC
Ti,

The search is in progress. Thanks for your patience in the meantime.

I think I have a machine identified, and hope we can get it ready to swap
out within 2 weeks.

-Brian



----- Original Message -----
From: Ti Leggett <***@ci.uchicago.edu>
To: CI Support <***@ci.uchicago.edu>
Cc: Brian Elliott Finley <***@anl.gov>; Bernard Li <***@vanhpc.org>;
Andrea Righi <***@gmail.com>; Andrea CAPRIOTTI
<***@cineca.it>; Matteo CHESI <***@cineca.it>; sisuite-dev
<sisuite-***@lists.sourceforge.net>
Sent: Sun Feb 08 13:10:18 2009
Subject: Re: [Fwd: systemimager.ci.uchicago.edu down]


David, can you reboot systemimager.ci when you get in on Monday? It's
the HP right near con.

Brian, it may be time to look into replacement hardware for this
machine.
Brian E. Finley
2009-02-08 20:24:40 UTC
Ok, Thanks,

-Brian



----- Original Message -----
From: Ti Leggett <***@ci.uchicago.edu>
To: Brian E. Finley <***@anl.gov>
Cc: ***@ci.uchicago.edu <***@ci.uchicago.edu>; ***@vanhpc.org
<***@vanhpc.org>; ***@gmail.com <***@gmail.com>;
***@cineca.it <***@cineca.it>; ***@cineca.it
<***@cineca.it>; sisuite-***@lists.sourceforge.net
<sisuite-***@lists.sourceforge.net>
Sent: Sun Feb 08 13:25:05 2009
Subject: Re: [Fwd: systemimager.ci.uchicago.edu down]

Note that it needs to be a 1U machine.