AMD GPU setup on Ubuntu 20.04 for mining cryptocurrency
I outlined my base setup for a new server in a previous post, but it won't get your system ready for GPU mining. For that, you need to get the drivers sorted out, install OpenCL, and have an easy-to-use tool for setting clock speeds.
Don't get me wrong, I do love HiveOS for dedicated mining systems. It's a well-crafted system/environment that has:
all the driver stuff sorted out
pre-installed mining software for dozens of coins
remote management in the cloud with a really slick user-interface.
some nice CLI tools for real-time system monitoring
While it's not free, it's so cheap that there's no reason not to use it for your mining rigs.
There are always corner cases, though, aren't there?
I recently purchased an older Dell PowerEdge R720 off eBay. For less than $400, you can get a server with dual Xeon V2 CPUs, dual PSUs, and 16GB of RAM or more (mine came with 128GB). The primary role this device plays on my network is to host backup VMs for important network services, namely DNS and a second node in my Aruba Virtual Mobility Controller cluster. Even though I suspect I could install kernel virtualization on HiveOS and run my VMs on it, I feel a lot better running a mainline Linux distribution on my server, as the GPU mining is just a side benefit, not the main purpose for the box.
It turns out there are some 6 or 7 PCIe slots in this little 2U server. Two of them are half-height and would work for a 10G Ethernet interface, but not a GPU. There is a pair of PCIe slots that will work, and if you get the right GPU (most desktop GPUs are too big to fit), you can get two GPUs into this server.
Now I happen to prefer AMD GPUs, primarily because there is no nVidia Limited Hash-Rate (LHR) bullshit to deal with, and even though the newer AMD RX 6800 XTs are some of the most efficient Hash/Watt cards you can buy, the premium price they carry isn't worth it to me. My GPU of choice is the RX 5700 XT (or RX 5700) because:
it's easy to flash and tune (very few brands are locked, unlike the RX 5600 XT),
it's reasonably efficient (in the neighborhood of 0.43 MH/s per watt on ETH),
it's 25% less CapEx-intensive on a $/MH/s basis (which means faster payback on investment)
There are two different cards (that I know of) that will fit in the Dell R720, and I currently have one of each stuffed in this box.
1. AMD RX 5700 Reference Card aka Blower-Style
These cards will sit in the chassis with the blower intake facing down (see server image above), which will conveniently suck in airflow from a nearby cavity and push it straight out the back of the server. This is ideally suited to the way airflow is designed to work in the R720 (and similar rack server chassis) and would be my go-to choice if they could be found consistently.
2. Dell OEM RX 5700
You would think that because these cards are sold by Dell that they would be perfectly suited for an R720 chassis, but they really aren't. The fan that sits closest to the rear of the chassis sits on top of either the power supplies or another bulkhead and can't really pull air from anywhere (or push it anywhere). Also, unlike the blower-style card, the pin-hole vents out the back are not intended to be the primary exhaust for the card. I am struggling to keep this card cool enough, even with the 6 chassis fans running at 6,000 RPM.
The reason these cards fit is 100% due to the profile at the top of the card sitting roughly level with the screw mount. I would call this a "low-profile" card, but I don't want to confuse that with card width, which can also be an issue with most graphics cards but isn't the main thing I'm looking at. It's mainly about card height, and the cards I've listed above barely fit into this server.
It's not too hard to guess which card is which from the output below. The Dell OEM is practically burning up and is thermal throttling.
GPU  CUs  CoreMHz  MemMHz  TEdge  TJct  TMem  FanPct  FanRpm  VDDC    Power
0    36   1275     875     64C    72C   82C   88.63%  4405    762 mV  110 W
1    40   1190     875     79C    85C   104C  88.63%  3382    737 mV  97 W
I may have to pull that Dell OEM card and re-pad it in the hopes of cooling it down a bit. The SoC MHz is also high (not depicted in the above table), and bringing that speed down might cool the card a bit more, but as of right now I can't determine how to set the SoC MHz with rocm-smi. Even with the thermal throttling, it's still getting 51-53MH/s.
Both of these cards are in fairly short supply, but since GPU mining with only 2 cards per server doesn't really scale too well, I doubt this will cause problems for too many people...
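If you want to keep an eye on a throttling card without staring at the table, a small filter can flag the hot ones. This is a minimal sketch, assuming the column layout shown in the table above (TMem in the 7th column); the function name `hot_gpus` and the 100C limit are my own illustrative choices, not a vendor spec.

```shell
#!/bin/sh
# Flag GPUs whose memory temperature (TMem column) meets or exceeds a limit.
# Input on stdin: rows in the table layout shown above, one GPU per line.
# The 100C default is an assumption for illustration only.
hot_gpus() {
    limit="${1:-100}"
    awk -v limit="$limit" '
        $1 ~ /^[0-9]+$/ {          # data rows start with the GPU index
            tmem = $7              # 7th column is TMem, e.g. "104C"
            sub(/C$/, "", tmem)    # strip the unit suffix
            if (tmem + 0 >= limit) print "GPU " $1 " TMem " tmem "C"
        }'
}

# Sample rows copied from the table above.
hot_gpus 100 <<'EOF'
GPU CUs CoreMHz MemMHz TEdge TJct TMem FanPct FanRpm VDDC Power
0 36 1275 875 64C 72C 82C 88.63% 4405 762 mV 110 W
1 40 1190 875 79C 85C 104C 88.63% 3382 737 mV 97 W
EOF
```

With the sample rows, only the card reading 104C on TMem is reported. The header line is skipped because its first field is not a number.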
Okay, so are you ready to get the details on how to get this shit working?
After many failed attempts, this is the magical link that finally got me what I needed.
Follow the directions there. After that you should be able to read and set your overclock settings, your mining software will find your GPUs, and you're good.
Unlike HiveOS, which has a daemon that regularly monitors the settings written by the UI and tunes your cards, you'll have to run some commands yourself. What I do is put the settings right into my shell startup scripts for the mining software, like this (example for TeamRedMiner), but you could put them in your system startup scripts as well if you wanted.
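Before launching a miner, it can save some head-scratching to confirm the tools actually landed on your PATH. Here is a minimal sketch: `missing_tools` is a hypothetical helper of mine, `rocm-smi` is the tool used in this post, and `clinfo` is an assumption on my part (a common OpenCL inspection utility that may need a separate install).

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# rocm-smi is used later in this post; clinfo is an assumed extra check.
missing=$(missing_tools rocm-smi clinfo)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "not found on PATH:" $missing
else
    echo "rocm-smi and clinfo are both available"
fi
```

This only checks that the commands exist. It says nothing about whether the driver actually sees your cards, so running `rocm-smi` by hand is still the real test.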
#!/bin/sh

# These environment variables should be set for the driver to allow
# max mem allocation from the GPU(s).
export GPU_MAX_ALLOC_PERCENT=100
export GPU_SINGLE_ALLOC_PERCENT=100
export GPU_MAX_HEAP_SIZE=100
export GPU_USE_SYNC_OBJECTS=1

# set the fan speed to about 90% (values are from 0-255)
/usr/bin/rocm-smi --setfan 228

# set the GPU clock to a max of 1275 MHz
/usr/bin/rocm-smi --setsrange 875 1275

# no need to set the Mem Clock, default at 875MHz
# no need to set Voltage, PowerPlay tables do their job

# show settings (purely optional)
/usr/bin/rocm-smi

# launch miner
./teamredminer -a ethash --eth_config=B \
  -o stratum+ssl://us2.ethermine.org:5555 \
  -u 0x4477D9Dd17524209d208a3792Ba0854c61BF0a1E.siegelgroupdonation \
  -p x
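Since the fan value in the script is on a 0-255 scale, a couple of tiny helpers make it easier to think in percentages. This is just arithmetic, sketched under the assumption that the scale is linear; `pct_to_fan` and `fan_to_pct` are names I made up for illustration.

```shell
#!/bin/sh
# Convert a fan percentage (0-100) to the 0-255 scale, rounded to nearest.
pct_to_fan() {
    echo $(( ($1 * 255 + 50) / 100 ))
}

# Convert a 0-255 fan value back to a percentage with two decimals.
fan_to_pct() {
    awk -v v="$1" 'BEGIN { printf "%.2f\n", v * 100 / 255 }'
}

pct_to_fan 90     # prints 230
fan_to_pct 228    # prints 89.41
fan_to_pct 226    # prints 88.63
```

So 228 is a touch under 90%, and the 88.63% that the tools report corresponds to a raw value of 226.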
Here's what happens after making those settings and kicking off mining.
# rocm-smi
== ROCm System Management Interface ====================
========= Concise Info =================================
GPU  Temp   AvgPwr  SCLK     MCLK    Fan     Perf    PwrCap  VRAM%  GPU%
0    64.0c  108.0W  1275Mhz  875Mhz  88.63%  manual  150.0W  97%    99%
1    78.0c  97.0W   1195Mhz  875Mhz  88.63%  manual  180.0W  97%    99%
====================================================
====== End of ROCm SMI Log =========================
The process is pretty similar with nVidia cards, but in my experience it's much easier: just download the nVidia drivers from their website, and nvidia-smi is installed and ready to go. I couldn't tell you why it's such a pain in the neck with AMD on Linux, but hopefully this guide will save you some time.
------------------- GPU Status ---------------------------------------
GPU 0 [64C, fan 88%] ethash: 53.78Mh/s, avg 53.76Mh/s, pool 54.43Mh/s
GPU 1 [79C, fan 88%] ethash: 52.62Mh/s, avg 51.41Mh/s, pool 52.47Mh/s
Total                ethash: 106.4Mh/s, avg 105.2Mh/s, pool 106.9Mh/s
Another handy tool on these Dell servers is ipmitool, which you can install to get chassis sensor data:
# ipmitool sdr list full
Fan1             | 5640 RPM          | ok
Fan2             | 6240 RPM          | ok
Fan3             | 5640 RPM          | ok
Fan4             | 6000 RPM          | ok
Fan5             | 6480 RPM          | ok
Fan6             | 6240 RPM          | ok
Inlet Temp       | 22 degrees C      | ok
Exhaust Temp     | 49 degrees C      | ok
Temp             | 77 degrees C      | ok
Temp             | 71 degrees C      | ok
Current 1        | 1.20 Amps         | ok
Current 2        | 1.20 Amps         | ok
Voltage 1        | 238 Volts         | ok
Voltage 2        | 238 Volts         | ok
Pwr Consumption  | 560 Watts         | ok
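If you'd rather log one line than the whole sensor dump, the pipe-separated output is easy to boil down. A minimal sketch, assuming the `name | value | status` layout shown above; `sdr_summary` is a hypothetical helper name, and the sample input is a subset of the readings from this post.

```shell
#!/bin/sh
# Summarize `ipmitool sdr list full` style output read from stdin:
# report the slowest fan and the chassis power consumption.
sdr_summary() {
    awk -F'|' '
        $1 ~ /^Fan/ {                     # e.g. "Fan1 | 5640 RPM | ok"
            rpm = $2 + 0                  # awk keeps the leading number
            if (min == "" || rpm < min) min = rpm
        }
        $1 ~ /^Pwr Consumption/ { watts = $2 + 0 }
        END { printf "min_fan_rpm=%d power_w=%d\n", min, watts }'
}

# Sample readings copied from the output above.
sdr_summary <<'EOF'
Fan1 | 5640 RPM | ok
Fan2 | 6240 RPM | ok
Fan5 | 6480 RPM | ok
Inlet Temp | 22 degrees C | ok
Pwr Consumption | 560 Watts | ok
EOF
```

On the live box the equivalent would be `ipmitool sdr list full | sdr_summary`, which generally needs root.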
That's 560 watts with two GPUs mining Ethereum and both Xeons mining AVN (formerly Ravencoin Lite). CPU mining accounts for around 200W of that (yes, I'm losing money), and the pair of GPUs are pulling about 265W, which at roughly 106 MH/s works out to about 0.40 MH/s per watt (400 kH/s/W). Note that power usage will be a little higher at the wall, but the dual 750W Dell PSUs are Platinum rated, so we should only be seeing a 3-5% delta at the wall vs. the system reading.
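The efficiency figure is just hashrate divided by power, and it's worth scripting so the units stay straight. A minimal sketch using the numbers from this post; `mhs_per_watt` is a name I picked for illustration, and the per-card line pairs the miner's reported rate with the rocm-smi AvgPwr reading, which may not reflect full board power.

```shell
#!/bin/sh
# Hashrate per watt: takes MH/s and watts, prints MH/s per watt (3 decimals).
mhs_per_watt() {
    awk -v mhs="$1" -v w="$2" 'BEGIN { printf "%.3f\n", mhs / w }'
}

mhs_per_watt 106.4 265   # both GPUs: total ethash rate over the 265W estimate
mhs_per_watt 53.78 108   # GPU 0 alone, using the rocm-smi AvgPwr reading
```

Multiply the result by 1000 if you prefer to think in kH/s per watt.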