Archive:Equipment/Borg

From London Hackspace Wiki
{{EquipmentInfobox
|name=Borg <!-- Name of the item. -->
|image=BorgLHS.jpg <!-- Image of the item. Leave with placeholder image if none exists. -->
|model=IBM 3950 M2 <!-- Model -->
|category=Equipment <!-- Main category. Please leave alone to keep item in this category -->
|subcat=Defunct <!-- Sub-category if one exists. Please check main listing to see other categories contained within the main one -->
|status=Scrapped
|consumables=<!-- Any items used up in normal operation, such as; ink, paper, saw-blades, cutting disks, oil, etc.. -->
|accessories=Bits <!-- Any items associated with the equipment but not consumable, such as; drill bits, safety gloves, goggles, etc.. -->
|reqtraining=yes
|trainlink=<!-- If training is required, provide a link to training signup or contact page. Otherwise leave blank. -->
|acnode=no
|owner=Hackspace <!-- Provide a link to owners members page if other than LHS -->
|origin=Donation <!-- If via pledge, please link to the completed pledge page on the wiki -->
|location=Basement rack
|maintainers=Oni <!-- If someone is nominated as managing the upkeep of this item, please list them here. No links please; it currently breaks the template. -->
|template_ver=1.1 <!-- Please do not change. Used for tracking out-of-date templates -->
}}
 
'''SCRAPPED AT [[Ujima House]] in Wembley.'''
Borg was a stack of 4 linked machines that combined as one, giving half a terabyte of RAM and 64 cores. It ran Linux and was intended for high-performance computing on computationally expensive tasks.
 
=Status=
 
The BORG cluster was complete. The system ran Ubuntu 12.04 LTS. We have 4 units in place (the maximum number) with Ubuntu 12.04 LTS installed and the usual hackspace credentials.
 
There are still some tasks to perform before the system is hackspace friendly. See the '''todo''' list below.


= Naming =
We named this machine after [http://en.wikipedia.org/wiki/Anita_Borg Anita Borg].
Also the [http://en.memory-alpha.org/wiki/Borg Borg].

They're currently labeled as BORG1 through BORG6.


= Access =
Access is via ssh. This system requires physically plugging in the power cable and launching the complex through the RAS controller on Borg5.
 
==Steps to boot the cluster==
 
* First, locate the power cable. There is a blue 16 A socket on the wall, and nearby a power bar with a matching 16 A plug. Attach the plug to the socket. (NEW: because the plug now seats very tightly, we unplug the primary cable from the power strip on the ground instead.) Borg should power up.
* Each Borg node has dual power supplies. If all 8 PSUs are connected, it may trip the breaker on the supply to the rack; if this happens, the UPS will start to beep. Switch the breaker back on!
* You can get away with connecting only one PSU on each machine.
* The lights on Borg should be FLASHING green: a slow flash on the power buttons. DO NOT press any of these buttons. All the machines need to be brought up via the RAS.
* With a web browser on your laptop, attached to the hackspace network, navigate to http://172.31.24.170 (previously listed as http://172.31.24.171; the correct address is unconfirmed).
* This is Borg5, the head node. You should see a web page asking for a username and password. Username USERID, password PASSW0RD (note the zero in the password, as opposed to the letter 'O').
* Navigate to the last item on the left - RAS Partitions.
* You should see the partition on the right as a block diagram. All the nodes should be connected and shown in blue. Find a button marked something like "boot partition". Press it.
* All the Borgs should start whirring, more lights will come on and the power lights will turn green.
* Wait about 10 minutes.
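Rather than timing the 10-minute wait by hand, one could poll the head node until ssh answers. The following is only a sketch: the function name, timeout and polling interval are made up for illustration, and it assumes that the head node address listed in the IP section (172.31.24.11) is still correct and that ssh listens on port 22.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=900.0, interval_s=15.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=5.0):
                return True
        except OSError:
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Hypothetical usage after pressing "boot partition" (address is an assumption):
#   wait_for_port("172.31.24.11", 22)
```

The defaults give the cluster 15 minutes, a little more than the 10 minutes quoted in the steps.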
 
Borg now has a static IP and gets username and password info from LDAP. It also mounts home directories from colin. All hackspace members can sudo on Borg. You should have access to all the CPUs and memory in all 4 Borg machines.
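Once logged in, a quick way to confirm that all 4 nodes really merged is to compare what Linux reports against the expected 64 cores and roughly 512 GB. This is a minimal sketch, not part of the original setup: the function names and the 5% slack (the kernel reports slightly less than the installed RAM) are assumptions.

```python
import os


def mem_total_gib(meminfo_text):
    """Parse the MemTotal line of /proc/meminfo text and return the value in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # the kernel reports this value in kB (KiB)
            return kib / (1024 ** 2)
    raise ValueError("no MemTotal line found")


def looks_like_full_partition(cpu_count, mem_gib, want_cpus=64, want_mem_gib=512, slack=0.05):
    """True if the CPU count matches and memory is within `slack` of the expected total."""
    return cpu_count == want_cpus and mem_gib >= want_mem_gib * (1 - slack)


def check_local():
    """Run the check against the machine this script is executed on (Linux only)."""
    with open("/proc/meminfo") as f:
        return looks_like_full_partition(os.cpu_count(), mem_total_gib(f.read()))
```

If `check_local()` returns False, a node seeing only 16 cores and about 128 GB would suggest it booted standalone rather than as part of the partition.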
 
= Specs, Layout and Status =
'''IBM 3950 M2'''
 
* 16 cores total per unit
** 4 x Xeon E7330 "Tigerton": 4 cores each @ 2.4 GHz [http://ark.intel.com/products/30794/Intel-Xeon-Processor-E7330-6M-Cache-2_40-GHz-1066-MHz-FSB Intel Ark spec sheet]
* 128 GB RAM per unit
<br />
 
The list reflects the '''layout''' in the rack.
 
*BORG5 - 4 CPUs, 128GB RAM - SN 99C5979 - 1.16 BIOS - HEAD Node
*BORG3 - 4 CPUs, 128GB RAM - SN 99C5980 - 1.16 BIOS
*BORG6 - 4 CPUs, 128GB RAM - 1.16 BIOS
*BORG4 - 4 CPUs, 128GB RAM - SN 99B3501 - 1.16 BIOS
*<s>BORG1 - 3 CPUs, 8GB RAM</s>
*<s>BORG2 - 3 CPUs, 8GB RAM</s>
 
 
BORG1 and BORG2 are lower spec and cannot be linked to the cluster (maximum of 4 nodes), so they should be cannibalised for spares and disposed of.
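The headline figures (64 cores, half a terabyte of RAM) follow directly from the four linked nodes in the list above. A small sketch of the arithmetic, with the per-node numbers copied from the list and the dictionary layout invented for illustration:

```python
# Per-node figures copied from the rack layout list (cluster members only).
NODES = {
    "BORG5": {"cpus": 4, "ram_gb": 128},
    "BORG3": {"cpus": 4, "ram_gb": 128},
    "BORG6": {"cpus": 4, "ram_gb": 128},
    "BORG4": {"cpus": 4, "ram_gb": 128},
}
CORES_PER_CPU = 4  # each Xeon E7330 is a quad-core part


def cluster_totals(nodes, cores_per_cpu=CORES_PER_CPU):
    """Sum CPUs, cores and RAM over all nodes in the partition."""
    cpus = sum(n["cpus"] for n in nodes.values())
    ram_gb = sum(n["ram_gb"] for n in nodes.values())
    return {"cpus": cpus, "cores": cpus * cores_per_cpu, "ram_gb": ram_gb}


print(cluster_totals(NODES))
```

This gives 16 CPUs, 64 cores and 512 GB of RAM.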


= IP =
* 172.31.24.12 - RAS
* 172.31.24.11 - head node
 
=RAS =
 
The RAS II (IBM's Remote Supervisor Adapter II, RSA II) is a separate control system that is set up in the BIOS and is accessible as soon as a BORG unit has power (i.e. whenever the green light is flashing OR solid). Using a web browser, head to either:
*Borg3/Borg6 http://172.31.24.170
*Borg5 http://172.31.24.171
 
Username '''USERID''' password '''PASSW0RD'''
 
=ScaleXpander=

To link up to 4 Borgs together we need to use the special cables. [http://www.redbooks.ibm.com/redbooks/pdfs/sg247630.pdf http://www.redbooks.ibm.com/redbooks/pdfs/sg247630.pdf] - page 235 shows how this is done to create an SMP-style set of nodes.


= Logbook =
[[Borg/LogBook]]


= Running Debian =

* Needs the non-free bnx2 firmware on a flash drive for the install to work (could try adding it to the install initrd).
* Something involving the IBM Calgary IOMMU (details unclear) leads to DMA errors, and the LSI MegaRAID card doesn't work. Booting with "iommu=soft" makes it work but may not be ideal. Search https://www.kernel.org/doc/Documentation/x86/x86_64/boot-options.txt for iommu.
* The incantation seems to be: iommu=soft,calgary megaraid_sas.msix_disable=1
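To make the kernel parameters above persistent on a GRUB 2 system, they are normally added to the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub. The sketch below only edits the text of that file and leaves reading, writing and running update-grub to the operator. The function name is invented here, and whether Borg's install used this file at all is an assumption.

```python
import re

# Parameters quoted in the "Running Debian" notes.
BORG_PARAMS = ["iommu=soft,calgary", "megaraid_sas.msix_disable=1"]


def add_kernel_params(grub_text, params=BORG_PARAMS, key="GRUB_CMDLINE_LINUX_DEFAULT"):
    """Return grub_text with any missing params appended to the quoted value of `key`."""
    pattern = re.compile(r'^(%s=")([^"]*)"' % re.escape(key), re.MULTILINE)

    def repl(match):
        current = match.group(2).split()
        # Keep existing options and append only those not already present.
        merged = current + [p for p in params if p not in current]
        return match.group(1) + " ".join(merged) + '"'

    new_text, count = pattern.subn(repl, grub_text)
    if count == 0:
        raise ValueError("%s not found" % key)
    return new_text
```

Applying the function twice changes nothing the second time, so it is safe to re-run. On Debian and Ubuntu the edited file takes effect only after update-grub regenerates the boot configuration.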
 
== Upgrading the bios ==

Do a diskless boot, then choose Debian and then "Jessie amd64 Diskless for BORGs", log in as root (password 'root'; this diskless setup is for testing only!), then:

 cd ibm-bios/z/
 ./lflash64

This is an upgrade to BIOS version 1.16.

We also need to upgrade:

* The RSA II thing: https://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5086633
* The FPGA (in the scaleXpander?)
* The BMC

We might be able to do it individually, or perhaps use the IBM UpdateXpress thing, which needs SUSE Linux Enterprise Server 11 x86-64 or Red Hat Enterprise Linux 6 x86-64.
 
We need to reset all the BIOS etc. settings to their defaults.
 
== link dump ==
 
* https://bugs.launchpad.net/ubuntu/+source/linux/+bug/343749
* http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5083138
* http://dump.asiantuntijakaveri.fi/le_bueno_dumpo/lsi/
* https://wiki.debian.org/LinuxRaidForAdmins
* http://hwraid.le-vert.net/wiki/DebianPackages
 
=== good megaraid cli guide: ===
 
* http://hwraid.le-vert.net/wiki/LSIMegaRAIDSAS
* http://www.linuxsa.org.au/pipermail/linuxsa/2011-November/094772.html
 
=== reflashing?!?: ===
 
* http://blog.asiantuntijakaveri.fi/2013/09/reflashing-lsi-megaraid-sas-8708elp.html
 
= Usage =
 
== Energy Consumption ==
 
=== Meter in-line ===
 
Testing with an ordinary, common-or-garden multimeter inline with the plug, on all 4 BORG units we have:
 
* 1.28 amps at 240 V - standby
* 7.8 amps peak on start
* 5 to 6 amps normal operation
* 9.48 amps with all 64 cores at 100%
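The current readings above can be turned into approximate power figures by multiplying by the mains voltage. This is a rough sketch: a multimeter reading times 240 V gives apparent power (VA), and the real wattage depends on the power factor, which was not measured. The names in the code are illustrative.

```python
MAINS_VOLTS = 240.0

# Readings copied from the list above, in amps.
READINGS_AMPS = {
    "standby": 1.28,
    "peak on start": 7.8,
    "normal operation (low)": 5.0,
    "normal operation (high)": 6.0,
    "all cores at 100%": 9.48,
}


def apparent_power_va(amps, volts=MAINS_VOLTS):
    """Apparent power in volt-amperes for a given current at mains voltage."""
    return amps * volts


for label, amps in READINGS_AMPS.items():
    print(f"{label}: {apparent_power_va(amps):.0f} VA")
```

On these numbers the four units draw about 307 VA on standby and about 2275 VA flat out.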
 
== Blender Tests==
 
Tests comparing Borg3 with AWS, rendering 30 frames of a small Blender scene with Blender 2.76 and maxing out the processors:
 
===AWS===
* $0.520 per on-demand Linux c1.xlarge instance-hour, for 12 hours
* Cost is roughly £4.60 at the exchange rate at the time
* Spot instances are possible with Brenda or similar - at $0.07 the cost would be roughly 53p
 
===Borg3===
 
* £0.12 per kilowatt-hour
* With all 16 cores operating at 100%, Borg3 was drawing 560 W
* Rendering the same scene took 206 minutes
* Total cost: 23p
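The Borg3 figure can be reproduced from the numbers above: energy is power times time, and cost is energy times the unit price. The sketch below also works out the AWS on-demand total in dollars; the exchange rate used for the sterling figures is not stated in the source, so no conversion is attempted. Function names are illustrative.

```python
def energy_cost_gbp(watts, minutes, gbp_per_kwh):
    """Electricity cost in pounds for a load of `watts` running for `minutes`."""
    kwh = (watts / 1000.0) * (minutes / 60.0)
    return kwh * gbp_per_kwh


def on_demand_cost_usd(usd_per_hour, hours):
    """Cost in dollars of one instance billed per hour."""
    return usd_per_hour * hours


borg3 = energy_cost_gbp(560, 206, 0.12)
aws = on_demand_cost_usd(0.52, 12)
print(f"Borg3: £{borg3:.2f}, AWS on-demand: ${aws:.2f}")
```

This prints £0.23 for Borg3, matching the 23p quoted, and $6.24 for the AWS on-demand run.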
 
Borg (being a server) could add substantially to our electricity costs. We can offset some of this by retiring other machines in the space. It has also been agreed that if it regularly draws more than 275 W, a decision about Borg's future will have to be made; if it goes over 300 W, we will have to rethink our strategy. These figures are maximums and equate to about £27 per month in electricity costs. The power consumption will be measured regularly, and anyone is welcome to report power consumption levels.
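The "about £27 per month" figure is consistent with a continuous 300 W draw. The sketch below assumes the £0.12 per kilowatt-hour price from the Blender test and a 31-day month, neither of which the paragraph states explicitly.

```python
def monthly_cost_gbp(watts, gbp_per_kwh=0.12, days=31):
    """Cost in pounds of a constant load of `watts` running all month."""
    kwh = watts / 1000.0 * 24 * days
    return kwh * gbp_per_kwh


for limit_w in (275, 300):
    print(f"{limit_w} W: £{monthly_cost_gbp(limit_w):.2f} per month")
```

Under these assumptions 275 W costs about £24.55 per month and 300 W about £26.78.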
 
= Location =
 
The Rack in the basement.


= TODO =

* Work out what's up with the 4th missing RAID controller - it looks like one node does not share its PCI devices?!?
* Play with the RAID management thingy
* https://github.com/chicks-net/megamap
* Work out what disks we have, where they are and what they do
* sdc and sdd are now free (was a VMware install for EMF)
* Re-install with Debian?
* <s>Get it under ansible (with NFS home dirs and LDAP users)</s> done
* Remove the FC cards we don't need (or get an FC disk array!)
* Channel bond some of the NICs
* Do something with the 10Gb NICs?!?!?


= Potential uses =

* Rendering video and 3D
* Bio-informatics number crunching (bio-hackers?)
* Simulation
* Realtime ray-tracing
* Radio FFT decoding in real-time (Cubesat related)

Latest revision as of 22:02, 24 May 2021

