Equipment/Landin
Revision as of 23:58, 9 January 2019
Multi-use server for services in Ujima House
The system was named after the British computer scientist Peter Landin, who was instrumental in using lambda calculus to model a programming language, leading to functional programming.
Please do not install anything directly on Landin (Make a VM)
Info
- IP: 10.0.20.10
- DNS: landin.london.hackspace.org.uk
- Access: LDAP
Stats
Landin is a Xyratex HS-1235T, an OEM storage server platform used for the IBM XIV, Dell Compellent, LaCie 12Big, Pure FA-300, and several others. Branded disk trays such as those from the NetApp DS4243 and from the other Xyratex OEM customers mentioned above fit in the array as well.
Note that the power button is just to the inside-front-left (just around the corner from the front-facing LED status lights)
- OEM Kontron Server Motherboard (similar to KTC5520 but without PCI Slot and Sound Card)
- 2 Six-core Xeon E5645 processors @ 2.4 GHz
- 96 GB ECC Memory
- Sun Microsystems ATLS1QGE Quad Port Gigabit Adapter LP Network Card
- Dual 120GB Western Digital Green SSDs (Software RAID-1)
- Avago LSI SAS2008 SAS PCIe JBOD Controller with the following ZFS disk configuration:
- 12-drive (1TB HGST HUA721010KLA330) single-pool RAIDZ2 (10TB usable) mounted as /peter
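The 10TB usable figure follows from RAIDZ2 reserving two drives' worth of space for parity. A rough sketch of that arithmetic is below; the function name `raidz_usable_tb` is made up for illustration, and the result ignores ZFS metadata, padding and TB/TiB rounding, so the real usable space will be somewhat lower.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZFS): approximate usable capacity of one RAIDZ vdev.
# usage: raidz_usable_tb DRIVES SIZE_TB PARITY
raidz_usable_tb() {
    drives=$1; size_tb=$2; parity=$3
    # Usable space is roughly (drives - parity) * per-drive size.
    echo $(( (drives - parity) * size_tb ))
}

# Landin: 12 x 1TB drives in RAIDZ2 (2 parity drives)
raidz_usable_tb 12 1 2
```

The same call with a parity of 1 or 3 gives the rough figure for RAIDZ1 or RAIDZ3 on the same drives.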
Documentation
- File:HS-1235T-ATX Quick Reference Sheet.pdf - Xyratex HS-1235T Kontron Motherboard Quick Reference Sheet (Slot speeds, etc.)
- File:User Manual 12big Rack Storage Server EN.pdf - System User Manual
- File:Quick Install Guide 12big Rack Storage Server EN.pdf - Xyratex HS-1235T Quick Install Guide (LaCie Branding)
Build Notes
- These are the notes for the build of Landin (and its functional twin Blanton)
- HW config and notes here: https://wiki.london.hackspace.org.uk/view/Equipment/Landin
Do the right thing and install the Software RAID-1 on the two boot SSDs. Install Notes here.
SSD install note: NO SWAP PARTITION (we've got 96GB of memory and the SSDs are only 120GB - make a swapfile on the ZFS array if we really need one)
Note with the above, grub-install fails, so:
- fdisk /dev/sda (and then sdb)
- Add in a second partition that is at the front of the drives, change new partition 2 to type 4 (BIOS BOOT)
- Then chroot /target /bin/bash and grub-install /dev/sda and grub-install /dev/sdb (assuming these are the SSDs being mirrored)
- Now system works with grub installs, reboots, etc.
FYI - sda (and similarly sdb) will look like this:
Disk /dev/sda: 111.8 GiB, 120040980480 bytes, 234455040 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier:

Device     Start       End   Sectors   Size Type
/dev/sda1   2048 234452991 234450944 111.8G Linux RAID
/dev/sda2     34      2047      2014  1007K BIOS boot
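To confirm that both SSDs ended up with the BIOS boot partition that grub-install needs, something like the following could be used. This is only a sketch: `has_bios_boot` is a made-up helper name, and it assumes `fdisk -l` labels the partition type as "BIOS boot", as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the `fdisk -l` output on stdin lists a
# partition of type "BIOS boot" (needed by grub-install on a GPT disk booted via BIOS).
has_bios_boot() {
    grep -q 'BIOS boot'
}

# Intended use on the real machine (needs root, so left commented out here):
#   for d in /dev/sda /dev/sdb; do
#       fdisk -l "$d" | has_bios_boot || echo "$d: no BIOS boot partition"
#   done
```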
Debian packages to install (support for legacy multi-user commands, compilation stuff, and more):
Please note you should add "contrib non-free" after main to the /etc/apt/sources.list for ZFS!
iotop htop sudo finger bsdgames ethtool* lynx elinks net-tools openssh-server sudo screen iproute resolvconf build-essential tcpdump vlan ethtool rsync git rdist bzip2 git-core less unzip curl flex bc bison netcat nmap locate vim zsh vim-scripts zfs-dkms zfsutils-linux nfs-kernel-server samba-common-bin qemu-kvm libvirt-clients libvirt-daemon-system libvirt-daemon lshw ipmitool tftpd-hpa apt-mirror smartmontools iozone3 minicom tmux mosh silversearcher-ag
Show off dmesg
Why can only superusers look at dmesg nowadays? It's kinda useful to see (yeah, OK, fine, everything is a security risk)
sudo sysctl kernel.dmesg_restrict=0
kernel.dmesg_restrict = 0
NOTE ABOVE - PUT IN /etc/sysctl.conf to make it permanent.
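One way to make the setting permanent without adding duplicate lines on repeated runs is sketched below. `persist_sysctl` is a made-up helper name; it only appends when no line for the key exists, so it will not change a value that is already set differently.

```shell
#!/bin/sh
# Hypothetical helper: append "KEY = VALUE" to a sysctl config file unless a
# line for KEY is already present, so repeated runs do not duplicate it.
# usage: persist_sysctl KEY VALUE FILE
persist_sysctl() {
    key=$1; value=$2; file=$3
    if ! grep -q "^[[:space:]]*${key}[[:space:]]*=" "$file" 2>/dev/null; then
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# On Landin (as root) this would be:
#   persist_sysctl kernel.dmesg_restrict 0 /etc/sysctl.conf
#   sysctl -p    # reload /etc/sysctl.conf
```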
Installing ZFS, Setting up ZPOOL and Live Disk Swapping
Already set up above in the mega-apt-get command. (Legacy note) Please note you may need to add contrib (and possibly non-free) to the /etc/apt/sources.list (!)
apt-get install linux-headers-$(uname -r)
apt-get install zfs-dkms zfsutils-linux
- EASY WAY TO MAKE THE ZPOOL (NOTE WHETHER YOU WANT RAIDZ1/Z2/Z3 and the WORKING DIRECTORY)
- Note you're using -f because you're using the whole disk and ignoring legacy disklabels...
cd /dev/disk/by-id
sudo zpool create -f peter raidz2 `ls ata-HITACHI*|grep -v part`
(this is easy because all of the donated 1TB drives are same-model HITACHI)
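Before running the real thing, it can be worth printing the disk list and the resulting command first. The sketch below does the same whole-disk filtering as the backtick expression above; `whole_disks` is a made-up helper name, and it assumes partitions show up with a `-partN` suffix as they do under /dev/disk/by-id.

```shell
#!/bin/sh
# Hypothetical helper: list whole-disk entries (no "-partN" suffix) in a
# by-id style directory whose names start with PREFIX.
# usage: whole_disks DIR PREFIX
whole_disks() {
    ls "$1" | grep "^$2" | grep -v -- '-part[0-9]*$'
}

# Dry run from /dev/disk/by-id: print the command rather than executing it.
#   cd /dev/disk/by-id
#   echo sudo zpool create -f peter raidz2 $(whole_disks . ata-HITACHI)
```

Once the printed command lists exactly the twelve expected drives, drop the `echo` to create the pool.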
- FYI - a similar pool creation expanded out would look like this
sudo zpool create -f kinnaman raidz2 \
  /dev/disk/by-id/ata-HITACHI_HUA721010KLA330_PAG06BGA \
  /dev/disk/by-id/ata-HITACHI_HUA721010KLA330_PAG06EWA \
  /dev/disk/by-id/ata-HITACHI_HUA721010KLA330_PAG0DJ9A \
  /dev/disk/by-id/ata-HITACHI_HUA721010KLA330_PAJ93TMF \
  /dev/disk/by-id/ata-HITACHI_HUA721010KLA330_PAJ9ES2F \
  /dev/disk/by-id/ata-HITACHI_HUA721010KLA330_PAJ9GPHF \
  /dev/disk/by-id/ata-HITACHI_HUA721010KLA330_PAJ9J1EF \
  /dev/disk/by-id/ata-HITACHI_HUA721010KLA330_PAJ9J59F \
  /dev/disk/by-id/ata-HITACHI_HUA721010KLA330_PAJ9N1AF \
  /dev/disk/by-id/ata-HITACHI_HUA721010KLA330_PAJ9N2TF \
  /dev/disk/by-id/ata-HITACHI_HUA721010KLA330_PAJ9N3EF \
  /dev/disk/by-id/ata-HITACHI_HUA721010KLA330_PBJ76D4F
Proxmox setup
We installed Debian Stretch (Debian 9.4.0 at the time) and then followed the Install Proxmox VE on Debian Stretch documentation. After that we needed to install the upgraded ZFS ZED daemon via apt-get and upgrade our zpool version as well.
- We'll probably just edit LDAP users to be in that group rather than complicate things with local-remote overlays!
- libvirt:x:113: and libvirt-qemu:x:64055:
- Remember to add LDAP users to the libvirt group using inheritance connectivity (or we just make the LDAP group be the 'auth'd group')
- Installed apt-mirror and sync'd the archive from RML's server
- rsync'd various items from RichmondMakerLabs mirror seed, updated /etc/apt/mirror.list with same URLs and updated local disk hierarchy.
TODO
- PHYSICAL: Move card to proper guaranteed x8 slots and confirm they are negotiating at full 5GT/s (SAS2008 and Sun Quad GBE)
- crontab zpool scrub (weekly)
- enable mail sending for daemon support and monitoring
x install latest sas2ircu (https://www.broadcom.com/products/storage/host-bus-adapters/sas-9210-8i#downloads) for mgmt
- install sas2ircu-status (from somewhere else) (Not needed?)
x install bay-identifier.sh
- label drive trays with last 4 or 6 serial number chunks (maybe not needed)
x Play with sas2ircu to see if we can get drives in certain bays to flash (useful finding failed drives to replace)
- configure smartd and other warning devices (smartd is dumb when drives get swapped - please note!)
- integrate into Hackspace infra (automatic emails, root ssh keys, etc.)
- Find rails to mount into
- Configure LACP for 4xGbE Sun Interface
- Export NFS to certain systems over LACP link?
- Configure ZeD for automation - /etc/zfs/zed.d/zed.rc Good notes here: http://louwrentius.com/the-zfs-event-daemon-on-linux.html
- Enable tftpd-hpa for TFTP booting of phones and PXE systems, etc.
x Enable apt mirroring for Local Debian/Ubuntu installs
- Documentation for generating VMs
- Mirroring latest Debian OS for VM installs
x Add MOTD:
Welcome to LANDIN.london.hackpsace.org.uk (Debian 9)
NmmdhhhhhhhhhhdmmN   This system provides:
mmhhhhhhhhhhhhhhhhhhhhmm   VM hosting for infra & test
NmdhhhhhhhhhhhhhhhhhhhhhhhhdmN   - ACNODE -ADMINSTUFF -BRUCE -CHOMSKY
mhhhhhhhhhhhhh/``/hhhhhhhhhhhhhm   NFS / TFTP / SMB / OpenLDAP
Ndhhhhhhhhhhhh/` `/hhhhhhhhhhhhdN   ZFS Volumes & Replication
Nhhhhhhhhhhhh/` ohhhhhhhhhhhhhN
Ndhhhhhhhhhhss. .shhhhhhhhhhhdN   Please use CHOMSKY for your general
dhhhhhhhhh/` .os. `` .syhhhhhhhhhd   system needs.
hhhhhhhh/` .ssy/ `/hhhhhhhh
hhhhhh/` .s/ `/hhhhhh
hhhhhh/` -o. `/hhhhhh   To create a new VM:
hhhhhhhh/` -oss. `/hhhhhhhh   1. Make Dabley Dobbles.
dhhhhhhhhhys. `` .os. `/hhhhhhhhhd   2. Bootle lambda frogs
Ndhhhhhhhhhhhs. .sshhhhhhhhhhdN   3. Baz barrybondaboo
Nhhhhhhhhhhhhho `/hhhhhhhhhhhhN   4. Edit the wiki with info
Ndhhhhhhhhhhhh/` `/hhhhhhhhhhhhdN
mhhhhhhhhhhhhh/``/hhhhhhhhhhhhhm
NmdhhhhhhhhhhhhhhhhhhhhhhhhdmN
mmhhhhhhhhhhhhhhhhhhhhmm
NmmdhhhhhhhhhhdmmN
Storage Pools
As above, one single RAIDZ2 pool of old 1TB 7200 RPM drives known as zpool 'peter'
Networks
- bond0 LACP group of 4 gigabit ethernet interfaces, tagged with VLANs
Bridges
- vmbr0 - Standard Linux Bridge, bridged to bond0.20. Think of it like an internal switch. Any VM attached to this bridge is effectively attached to the Servers VLAN
- vmbr1 - Standard Linux Bridge, bridged to bond0.30. This is for the cctv network - you probably don't want this one!
- vmbr2 - Standard Linux Bridge, bridged to bond0.10. This is for the management network - you probably don't want this one!
Current VMs
Chomsky
Chomsky is a general-purpose system for LHS member usage (IRC client use, Robonaut, shell interaction, http://hack.rs/ URL & forwards, light programming tasks, etc.).
- If you are a current London Hackspace member and would like to login to Chomsky, please create and enable your LDAP login here.
- Once your LDAP login has been created, use your ssh client and account details to connect to chomsky.hack.rs. (We also resolve internally to chomsky.lan.london.hackspace.org.uk)
- If you have a software package you'd like installed on the system, please engage with any of the maintainers via IRC or the mailing list and we'll do our best to accommodate you.
ACserver
Adminstuff
- Adminstuff serves the network admin bits that were originally on the retired physical host denning. It now runs Ansible, apt-cacher-ng, tftpboot + pxeboot stuff, and an NFS server for diskless booting via Netboot.
apt-cacher-ng
Should you want to use our local cache when installing the latest Debian/Ubuntu/Raspbian packages, you can point apt at our local proxy.
Simply specify http://adminstuff.lan.london.hackspace.org.uk:3142 or have the line
Acquire::http::Proxy "http://adminstuff.lan.london.hackspace.org.uk:3142/";
in something like /etc/apt/apt.conf.d/proxy.conf
Remember to delete this file if you take your computer off of the Hackspace network!
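A small sketch of setting this up (and undoing it) from a shell. `write_apt_proxy` is a made-up helper name; the file path and proxy URL are the ones given above.

```shell
#!/bin/sh
# Hypothetical helper: write an apt proxy snippet pointing at PROXY_URL.
# usage: write_apt_proxy FILE PROXY_URL
write_apt_proxy() {
    printf 'Acquire::http::Proxy "%s";\n' "$2" > "$1"
}

# On a machine inside the Hackspace network (as root):
#   write_apt_proxy /etc/apt/apt.conf.d/proxy.conf http://adminstuff.lan.london.hackspace.org.uk:3142/
# And before taking the machine off the network:
#   rm /etc/apt/apt.conf.d/proxy.conf
```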
Redmine
Icinga 2
Services
- apt-mirror / apt-cacher-ng (we probably only want one of these)
- TFTP Serving for PXE Boot Support
Scheduled Services
We use our reasonably equipped data storage and bandwidth to our advantage, especially when synchronising new Ubuntu and Debian variants.
- apt-mirror syncs the following Debian and Debian-derived repositories at 4AM every morning:
Debian Unstable main contrib non-free
Debian Stable main contrib non-free
Debian Stretch main contrib non-free
Ubuntu 16.04 main restricted universe multiverse
Ubuntu 18.04 main restricted universe multiverse
Raspbian jessie main contrib non-free rpi
Raspbian stretch main contrib non-free rpi
- ZFS Scrubbing for Data Health & Verification
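For the scrub, a weekly cron.d entry is one option. The sketch below only prints such an entry; the schedule (Sunday 03:00, chosen here to stay clear of the 4AM apt-mirror run), the `/sbin/zpool` path and the helper name `scrub_cron_line` are all assumptions, not the configuration actually in use on Landin.

```shell
#!/bin/sh
# Hypothetical helper: print a cron.d entry that scrubs POOL every Sunday at 03:00.
# Fields: minute hour day-of-month month day-of-week user command
# Assumes zpool lives at /sbin/zpool; check with `command -v zpool`.
scrub_cron_line() {
    printf '0 3 * * 0 root /sbin/zpool scrub %s\n' "$1"
}

# To install it (as root):
#   scrub_cron_line peter > /etc/cron.d/zpool-scrub
```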
How to:
Create a new VM
Via the web interface
- Go to https://landin.lan.london.hackspace.org.uk:8006
- Login with your LDAP credentials
- Click Create VM in the top right corner
- In the general tab, click advanced in the lower right corner and then set the name and check "start at boot"
- In the OS tab, select your desired ISO image in the drop down list and configure the parameters for the guest OS
- In the Storage tab, select a SCSI device, set the storage to the "peter" zpool and enter your desired disk size. Check advanced and also check the "discard" box (Important for thin provisioning)
- In the CPU tab, select your desired number of cores and sockets
- In the memory tab, select your desired size for the RAM
- In the Network tab, select "vmbr0" for the bridge and set the model to "VirtIO"
- In the Confirm tab, check "start after created" and click finish
Via CLI
Note: We should probably create a wrapper script to make this easier, to enforce naming conventions, run Ansible, and other devops-esque stuff
- First of all SSH into Landin. Your user will need the appropriate permissions to create a VM
- Find an available "ID". Let's try and keep them contiguous:
qm list
- View available ISOs (Or upload your ISO to the same directory)
ls /var/lib/vz/template/iso
- Create the VM
qm create [ID] --name [NAME] --cdrom [PATH TO ISO] --memory [RAM] --cores [CORES] --net0 [INTERFACE] --scsi0 [LOCATION,SIZE]
- Example of a Debian VM with a single core, 512MB of RAM, 10G HDD and connected to the "Bridge" interface
qm create 104 --name "qm-test" --cdrom /var/lib/vz/template/iso/debian-9.4.0-amd64-netinst.iso --memory 512 --cores 1 --net0 "virtio,bridge=vmbr0" --scsi0 "file=peter:10,discard=on,size=10G"
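As a starting point for the wrapper script mentioned in the note above, the two helpers below pick the next ID from `qm list` output and print (but do not run) a `qm create` line in the same shape as the example. Both function names are made up, and the column layout of `qm list` (VMID in the first column) is assumed rather than taken from this page.

```shell
#!/bin/sh
# Hypothetical helper: read `qm list` output on stdin and print the next ID
# (highest existing VMID + 1, or 100 if no VMs are listed).
next_vmid() {
    awk '$1 ~ /^[0-9]+$/ && $1 + 0 > max { max = $1 + 0 }
         END { print (max ? max + 1 : 100) }'
}

# Hypothetical helper: print a qm create command modelled on the example above.
# usage: qm_create_cmd ID NAME ISO MEMORY_MB CORES DISK_GB
qm_create_cmd() {
    printf 'qm create %s --name "%s" --cdrom %s --memory %s --cores %s --net0 "virtio,bridge=vmbr0" --scsi0 "file=peter:%s,discard=on,size=%sG"\n' \
        "$1" "$2" "$3" "$4" "$5" "$6" "$6"
}

# Intended use on Landin:
#   id=$(qm list | next_vmid)
#   qm_create_cmd "$id" qm-test /var/lib/vz/template/iso/debian-9.4.0-amd64-netinst.iso 512 1 10
```

Printing the command first leaves room to review it (or to add naming checks) before anything is created.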
From an existing disk image
Create a VM from the CLI or web as above; there is no need to start it. Then delete its disk from the hardware config.
Then follow this: http://dae.me/blog/2340/how-to-add-an-existing-virtual-disk-to-proxmox/
If the old vm image is stored on ZFS then you'll need to set the disk cache used by proxmox to `writeback`
Once the disk appears in the proxmox UI you can add it to the vm and activate it (? Can't quite remember how I did it, but the cache thing is the main thing to know)
Notes
There is an apt-cacher-ng setup on landin listening on 10.0.20.10:3142
Netbooting should work now; the tftp server and files are on the adminstuff VM. There is a Debian Stretch installer with a preseed config that sets up ssh keys for root for some of the admins.
RAID Status and How to Blink a Light and Replace a Drive
Thankfully the system is not in the middle of a woodshop, but the batch of Hitachi 1TB drives is pretty old and we should expect disk failures to happen. This is an overview of tools available to diagnose the health of the array.
How is the ZFS Zpool Health, How is the Hardware Health
- Very likely you want to see how ZFS sees the drives. This command should suffice:
# zpool status -v
- You can check the list of hardware connected to the array via the LSI (Avago/Broadcom) utility sas2ircu
# sas2ircu 0 display
(you'll want to pipe this to less or a text file to scroll through the various notes.)
- Maybe you want to run through smartctl and see whether any of the disks are in a pre-fail state. Try a shell script like this:
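for i in {a..o}; do
    # Print the serial number plus the key pre-fail SMART attributes:
    # 5 = reallocated sectors, 197 = pending sectors, 198 = offline uncorrectable
    echo "Disk sd$i"
    smartctl -i -A "/dev/sd$i" | grep -iE '^ *(5|197|198) |FAILING_NOW|serial'
done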
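To cut the smartctl output down to only the drives that need attention, the attribute table can be filtered on its raw values. A sketch, assuming the usual 10-column ATA attribute table from `smartctl -A` (ID# in column 1, RAW_VALUE in column 10); `smart_warnings` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -A` output on stdin and print the name and
# raw value of attributes 5, 197 and 198 whenever the raw value is non-zero.
smart_warnings() {
    awk '($1 == 5 || $1 == 197 || $1 == 198) && $10 + 0 > 0 { print $2, $10 }'
}

# Intended use (as root):
#   for d in /dev/sd[a-o]; do echo "$d"; smartctl -A "$d" | smart_warnings; done
```

A drive that prints nothing after its name has zero reallocated, pending and uncorrectable sectors.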
ZFS Disk Death - what to do
If 1 or 2 disks die in the ZFS zpool, you'll want to replace them. You'll see a disk or two with the status UNAVAIL and the zpool state DEGRADED. We don't want to shut off the computer, so what to do?
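To pull just the dead devices out of the status output, a filter along these lines may help. It is a sketch that assumes the usual `zpool status` config table (device name in column 1, state in column 2); `failed_vdevs` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `zpool status` output on stdin and print the names
# of devices whose state is UNAVAIL, FAULTED, REMOVED or OFFLINE.
failed_vdevs() {
    awk '$2 == "UNAVAIL" || $2 == "FAULTED" || $2 == "REMOVED" || $2 == "OFFLINE" { print $1 }'
}

# Intended use:
#   zpool status -v peter | failed_vdevs
```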
- Make note of the disk ID(s) and search for those drives by doing
# sas2ircu 0 display | less
- While scrolling up and down using less, you can find the affected dying drive serial number (starts with the letter P in our Hitachi examples)
- Make a note of the enclosure number and the slot number on the controller in the command above.
- Make the affected disk(s) blink in their slots if you have enclosures that blink properly. DON'T JUST CUT AND PASTE THIS COMMAND AND REPLACE THE WRONG DRIVE BECAUSE YOU MADE THE WRONG SLOT BLINK. The example below blinks slot 1 in enclosure 2:
# sas2ircu 0 locate 2:1 on
- then you'll see the blinking slot(s) and can remove those affected disks.
- Replace the drives in the disk trays (you may need a Torx T10 driver or a careful flathead screwdriver to replace drives in the tray), and then reinsert.
- Turn the blinking light off.
# sas2ircu 0 locate 2:1 off
- Find the new drive by looking for the latest drive added in dmesg and then poking around /dev/disk/by-id for the right serial number. Example disk replacement (remember, use zpool status to find the old disk to replace):
# zpool replace -f peter ata-HITACHI_HUA721010KLA330_PAJ9N3EF ata-HITACHI_HUA721010KLA330_PBJ7DNWE
You can then run
zpool status -v
to see the replacement in progress and a time estimation to finish replacing the old drive in the ZFS array. Nice!
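The serial-number lookup in the steps above can also be scripted rather than done by scrolling in less. The sketch below assumes the per-device fields in the `sas2ircu 0 display` output are labelled "Enclosure #", "Slot #" and "Serial No", in that order; `slot_for_serial` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `sas2ircu 0 display` output on stdin and print
# ENCLOSURE:SLOT for the device whose serial number contains SERIAL.
# usage: slot_for_serial SERIAL
slot_for_serial() {
    awk -F ' *: *' -v serial="$1" '
        /Enclosure #/ { encl = $2 }
        /Slot #/      { slot = $2 }
        /Serial No/ && index($2, serial) { print encl ":" slot }
    '
}

# Intended use (as root):
#   sas2ircu 0 display | slot_for_serial PAJ9N3EF
```

The result is in the same enclosure:slot form that `sas2ircu 0 locate` expects, but still check the blinking tray against the serial number on the drive label before pulling anything.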