Oct 23 2015

Hi, so you are here because you lost access to the management controller of your MSA2012. Not to worry, it happened to me too: after running an OpenVAS security scan, the management interface was knocked off the network – I couldn’t access SSH, let alone the web interface, and couldn’t even ping its IP address. D’oh. But data was still being served, so the storage controllers were working just fine. How weird.

What I did is probably not recommended by HP, or even forbidden – but some of us simply like living on the edge and experimenting with stuff (and it was a good exercise to see whether this kit is actually capable of failover and of using multiple paths).

Besides, we have been using this particular “out of warranty” storage array for scratch data only, so a short downtime or even the possibility of losing it wouldn’t matter that much. Bear this in mind before you start pulling a live controller out of that payroll storage array, right? And obviously, please do not try this on a disk array equipped with only a single controller.

What you need is a serial cable with a mini-DB9 connector; you probably got one with your MSA2012 – the one that looks like this:

(photo: the mini-DB9 serial cable)

Hook it up to the storage array

and connect the other end to a serial port on your server – preferably one that actually uses this block device (just so you can run tail -f /var/log/messages in another terminal and monitor how bad the situation is).
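
If the LUNs are consumed through device-mapper-multipath – an assumption, adjust to whatever multipathing you actually run – a second terminal like this lets you watch paths drop and come back:

watch -n5 'multipath -ll'        # path state per LUN (dm-multipath)
tail -f /var/log/messages        # kernel / iSCSI / SAS chatter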

The second element we need is a piece of software – a terminal emulator. Both minicom and picocom should work, and both are available in Debian and CentOS.

I used picocom, assuming your port is COM0, i.e. /dev/ttyS0 (you might be using COM1 for a DRAC console, as described here):

yum install picocom          # CentOS
# or
apt-get install picocom      # Debian
picocom -b 115200 /dev/ttyS0

Now, in theory, after hitting Enter you should get the MSA Management Controller (MC) prompt, starting with a # sign. Mine was dead. This guide assumes that you cannot take the array down completely and that you are ready to take some risk. Also, if you can, try to minimise I/O to the array.

Now the part HP won’t like: I unscrewed and removed controller B briefly – just for 10 seconds – and then re-inserted it into its slot. What happened: storage controller A took over operation as it should (so data access was *almost* uninterrupted) and controller B booted, bringing both management and storage services back online. With the serial cable connected I was able to watch the controller boot.

Once management controller B is online, wait ~2 minutes so things settle down, then check the status and, if it is looking good, restart controller A:

show system
show configuration
restart mc A

To quit picocom:

Ctrl+A
Ctrl+X


After a short while you should have both management controllers operational.
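
A lazy way to confirm both management controllers answer again – the IP addresses below are made up, substitute your own:

for ip in 192.168.0.10 192.168.0.11; do          # hypothetical MC addresses
    ping -c1 -W2 $ip >/dev/null && echo "$ip up" || echo "$ip DOWN"
done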

Tested with two HP MSA2012i (iSCSI) and one HP MSA2012sa (direct-attach SAS) – smooth sailing all the way.

Now, time to find out why they went down in the first place – I suppose I’m a few firmware releases behind…



Sep 16 2015

Stolen from Reddit (source).

1) Set up a KVM hypervisor.
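
On CentOS 6 that first step is roughly the following (package group names vary between releases, so treat this as a sketch):

yum groupinstall "Virtualization" "Virtualization Client" "Virtualization Platform"
service libvirtd start && chkconfig libvirtd on
virt-install --version                           # sanity-check the toolchain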

2) Inside of that KVM hypervisor, install a Spacewalk server. Use CentOS 6 as the distro for all work below. (For bonus points, set up errata importation on the CentOS channels, so you can properly see security update advisory information.)

3) Create a VM to provide named and dhcpd service to your entire environment. Set up the dhcp daemon to use the Spacewalk server as the pxeboot machine (thus allowing you to use Cobbler to do unattended OS installs). Make sure that every forward zone you create has a reverse zone associated with it. Use something like “internal.virtnet” (but not “.local”) as your internal DNS zone.
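
For the PXE piece, the relevant dhcpd.conf fragment is only a few lines – every address below is an invented placeholder:

# /etc/dhcp/dhcpd.conf (fragment)
subnet 10.0.0.0 netmask 255.255.255.0 {
    range 10.0.0.100 10.0.0.200;
    option domain-name "internal.virtnet";
    option domain-name-servers 10.0.0.2;         # the named VM
    next-server 10.0.0.3;                        # the Spacewalk/Cobbler box
    filename "pxelinux.0";                       # served from Cobbler's TFTP
}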

4) Use that Spacewalk server to automatically (without touching it) install a new pair of OS instances, with which you will then create a Master/Master pair of LDAP servers. Make sure they register with the Spacewalk server. Do not allow anonymous bind, do not use unencrypted LDAP.
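
To make “no anonymous bind, no cleartext” concrete: on OpenLDAP’s cn=config it comes down to something like this sketch (your DIT and TLS setup will differ):

# harden.ldif – apply with: ldapmodify -Y EXTERNAL -H ldapi:/// -f harden.ldif
dn: cn=config
changetype: modify
add: olcDisallows
olcDisallows: bind_anon
-
add: olcSecurity
olcSecurity: tls=1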

5) Reconfigure all 3 servers to use LDAP authentication.
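
On each client, the LDAP half of that reconfiguration boils down to an sssd.conf along these lines (base DN and hostnames are assumptions), activated with authconfig --enablesssd --enablesssdauth --update:

# /etc/sssd/sssd.conf – must be mode 0600
[sssd]
config_file_version = 2
services = nss, pam
domains = virtnet

[domain/virtnet]
id_provider = ldap
auth_provider = ldap
ldap_uri = ldaps://ldap1.internal.virtnet, ldaps://ldap2.internal.virtnet
ldap_search_base = dc=internal,dc=virtnet
ldap_tls_reqcert = demand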


6) Create two new VMs, again unattendedly, which will then be PostgreSQL VMs. Use pgpool-II to set up master/master replication between them. Export the database from your Spacewalk server and import it into the new pgsql cluster. Reconfigure your Spacewalk instance to run off of that cluster.
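
Worth knowing before you start: pgpool-II’s “replication mode” is statement-level replication rather than true multi-master, but it fits the exercise. The relevant pgpool.conf fragment (hostnames invented):

# /etc/pgpool-II/pgpool.conf (fragment)
replication_mode = on
load_balance_mode = on
backend_hostname0 = 'pg1.internal.virtnet'
backend_port0 = 5432
backend_data_directory0 = '/var/lib/pgsql/data'
backend_hostname1 = 'pg2.internal.virtnet'
backend_port1 = 5432
backend_data_directory1 = '/var/lib/pgsql/data'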

7) Set up a Puppet Master. Plug it into the Spacewalk server for identifying the inventory it will need to work with. (Cheat and use ansible for deployment purposes, again plugging into the Spacewalk server.)

8) Deploy another VM. Install iscsitgt and nfs-kernel-server on it. Export a LUN and an NFS share.
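
On CentOS 6 the target side is typically scsi-target-utils (tgtd) rather than the old iscsitarget packages; a sketch of both exports, with invented paths and IQN:

# /etc/tgt/targets.conf
<target iqn.2015-09.virtnet.internal:storage.lun0>
    backing-store /srv/iscsi/lun0.img            # pre-create with dd or fallocate
</target>

# /etc/exports
/srv/nfs 10.0.0.0/24(rw,sync,no_root_squash)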

9) Deploy another VM. Install Bacula on it, using the PostgreSQL cluster to store its database. Register each machine on it, storing to flatfile. Store the Bacula VM’s image on the iSCSI LUN, and every other machine’s on the NFS share.
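
In bacula-dir.conf every registered machine becomes a Client resource roughly like this (name, address and password are placeholders):

# bacula-dir.conf fragment – one Client per machine
Client {
  Name     = web1-fd
  Address  = web1.internal.virtnet
  FDPort   = 9102
  Catalog  = MyCatalog
  Password = "changeme"            # must match the Password in web1's bacula-fd.conf
}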

10) Deploy two more VMs. These will have httpd (Apache2) on them. Leave essentially default for now.

11) Deploy two more VMs. These will have tomcat on them. Use JBoss Cache to replicate the session caches between them. Use the httpd servers as the frontends for this. The application you will run is JBoss Wiki.
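
The httpd-to-tomcat wiring can be plain mod_proxy_balancer (AJP via mod_proxy_ajp works too); hostnames below are assumptions:

# httpd conf fragment on each frontend
<Proxy balancer://wiki>
    BalancerMember http://tomcat1.internal.virtnet:8080 route=node1
    BalancerMember http://tomcat2.internal.virtnet:8080 route=node2
    ProxySet stickysession=JSESSIONID
</Proxy>
ProxyPass        / balancer://wiki/
ProxyPassReverse / balancer://wiki/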

12) You guessed right – deploy another VM. This will do iptables-based NAT/round-robin load balancing between the two httpd servers.
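
The statistic match does the round-robin half; the backend addresses below are made up:

# on the loadbalancer VM
echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A PREROUTING -p tcp --dport 80 \
    -m statistic --mode nth --every 2 --packet 0 \
    -j DNAT --to-destination 10.0.0.21:80        # httpd frontend #1
iptables -t nat -A PREROUTING -p tcp --dport 80 \
    -j DNAT --to-destination 10.0.0.22:80        # httpd frontend #2 (the rest)
iptables -t nat -A POSTROUTING -j MASQUERADE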

13) Deploy another VM. On this VM, install postfix. Set it up to use a gmail account to allow you to have it send emails, and receive messages only from your internal network.
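
The Gmail relay side of main.cf is a handful of parameters, plus a postmap’d credentials file of your own:

# /etc/postfix/main.cf (fragment)
relayhost = [smtp.gmail.com]:587
smtp_use_tls = yes
smtp_sasl_auth_enable = yes
smtp_sasl_security_options = noanonymous
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
mynetworks = 10.0.0.0/24, 127.0.0.0/8            # accept mail from inside only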

14) Deploy another VM. On this VM, set up a Nagios server. Have it use SNMP to monitor the communication state of every relevant service involved above. This means doing an “is the right port open” check, an “I got the right kind of response” check, and a “we still have filesystem space free” check.
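
The first two checks map straight onto stock plugins, and the disk check can go over SNMP against HOST-RESOURCES-MIB; a service sketch where the host and command definitions are assumed to exist:

# nagios object config fragment
define service {
    use                   generic-service
    host_name             web1
    service_description   HTTP port open
    check_command         check_tcp!80
}
define service {
    use                   generic-service
    host_name             web1
    service_description   HTTP answers correctly
    check_command         check_http!-e 200
}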

15) Deploy another VM. On this VM, set up a syslog daemon to listen to every other server’s input. Reconfigure each other server to send their logging output to various files on the syslog server. (For extra credit, set up Logstash, Kibana, or Graylog to parse those logs.)
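
With rsyslog (the CentOS 6 default) both ends are a few lines each:

# server: /etc/rsyslog.conf – listen on UDP 514, one file per host
$ModLoad imudp
$UDPServerRun 514
$template PerHost,"/var/log/remote/%HOSTNAME%.log"
*.* ?PerHost

# every client: forward everything to the syslog VM
*.* @syslog.internal.virtnet:514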

16) Document every last step you did in getting to this point in your brand new Wiki.

17) Now go back and create Puppet manifests to ensure that every last one of these machines is authenticating to the LDAP servers, registered to the Spacewalk server, and backed up by the Bacula server.
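
Shape-wise that baseline can be as small as a default node with three profile classes (the class names here are invented):

# site.pp sketch
node default {
  include profile::ldap_auth       # sssd pointed at both LDAP masters
  include profile::spacewalk       # keeps registration in place
  include profile::bacula_client   # bacula-fd plus the director's password
}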

18) Now go back, reference your documents, and set up a Puppet Razor profile that hooks into each of these things to allow you to recreate, from scratch, each individual server.

19) Destroy every secondary machine you’ve created and use the above profile to recreate them, joining them to the clusters as needed.

20) Bonus exercise: create three more VMs. A CentOS 5, 6, and 7 machine. On each of these machines, set them up to allow you to create custom RPMs and import them into the Spacewalk server instance. Ensure your Puppet configurations work for all three and produce like-for-like behaviors.
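
The RPM half of that bonus is the same dance on all three releases; package and channel names below are placeholders:

yum install rpm-build rpmdevtools
rpmdev-setuptree                                 # creates ~/rpmbuild/{SPECS,SOURCES,...}
rpmbuild -ba ~/rpmbuild/SPECS/mytool.spec
rhnpush -c mytool-channel \
    ~/rpmbuild/RPMS/x86_64/mytool-*.rpm          # push into the Spacewalk channel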

Do these things and you will be fully exposed to every aspect of Linux Enterprise systems administration. Do them well and you will have the technical expertise required to seek “Senior” roles. If you go whole-hog crash-course full-time it with no other means of income, I would expect it would take between 3 and 6 months to go from “I think I’m good with computers” to achieving all of these — assuming you’re not afraid of IRC and google (and have neither friends nor family …).