Vendor addition
From a vendor perspective I think you are all correct. It actually comes down to what the render workload is and what the network I/O is as a ratio of the CPU workload.
Some additional points:
1. RDMA can function on both InfiniBand and Ethernet (it's called RoCE on Ethernet). In fact, a new RoCE version has recently come out that allows it to operate on L3 Ethernet, making it routable (i.e. Routable RoCE, a.k.a. RoCEv2).
2. RDMA/RoCE will virtually remove all of the network overhead on your render CPUs (it's direct memory access), as it completely bypasses the kernel and the TCP stack.
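As a quick sanity check that RDMA is actually in play between two hosts, a sketch using the standard rdma-core / librdmacm tools (assumes they are installed; the address below is an example):

```shell
# List the RDMA-capable devices the host can see (from rdma-core)
ibv_devices

# RDMA-level ping between two hosts using librdmacm's rping.
# On the server:
rping -s -v
# On a client (substitute your server's address):
rping -c -a 192.168.1.10 -v -C 3
```

If rping completes, the RDMA path (IB or RoCE) is working end to end before you layer any ULP on top of it.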
3. RDMA/RoCE is already baked into loads of upper-layer protocols (ULPs): NFS, SMB (called SMB Direct), iSCSI (called iSER), GPFS, SRP (SCSI over RDMA) and others. It's also native in the mainstream OSes, including both Linux and Windows; there are a few switches you need to flip, but it's relatively easy to use.
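To give a feel for how few switches that really is, here's a sketch of an NFS-over-RDMA mount on a Linux client (server name, export and mount point are placeholders; the client module name varies by kernel version, with xprtrdma on older kernels):

```shell
# Load the RPC-over-RDMA transport on the client
modprobe rpcrdma

# Mount the export over RDMA instead of TCP
# (20049 is the conventional NFS/RDMA port)
mount -t nfs -o rdma,port=20049 server:/export /mnt/render
```

Everything above the transport is still plain NFS, so render nodes and applications don't need to change at all.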
4. The entry point is 10GbE, but obviously the benefits increase as you go to higher speeds, as the delta to traditional TCP grows. It also interoperates, so you can run standard TCP and RoCE over the same network at the same time, which drives towards a single network and reduces cost, especially as a single network could potentially mean no FC, etc.
5. With Mellanox, the adapter is agnostic to both the network type and the speed (it's called VPI, Virtual Protocol Interconnect), so you can run RDMA/RoCE on both IB and Ethernet at up to 56Gb/56GbE. Yes, I know 56GbE Ethernet is proprietary, but it's in the product set, so you might as well use the extra 40% of horsepower.
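For what it's worth, on a VPI card the port personality is just a firmware setting. A sketch using Mellanox's mlxconfig tool (the device path is an example for a ConnectX-3; your path and exact values may differ, so treat this as illustrative):

```shell
# Start the Mellanox software tools and find the device
mst start
mst status

# Query the current port configuration (example device path)
mlxconfig -d /dev/mst/mt4099_pci_cr0 query

# Set port 1 to Ethernet (2 = ETH, 1 = IB), then reboot to apply
mlxconfig -d /dev/mst/mt4099_pci_cr0 set LINK_TYPE_P1=2
```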
6. You can also use commodity hardware. Take a look at PixitMedia: their solution is built entirely on commodity hardware components and can scale to 30 simultaneous streams of 16-bit 4K (4x3) EXR at 25fps. Sure, you won't need this storage horsepower for render, but it does scale out linearly and cost-effectively, and having a single namespace for the lot makes things a lot easier for people.
In my opinion, go with what you need. If you are building out a render farm that doesn't have or need high network I/O, go with 10GbE and use RoCE; this will give you CPU cycles back at no extra cost. If your render does have high network I/O, go with something more powerful. It's all workload dependent. Horses for courses, as they say!
Rich.
PS. Tin-hat and flame suit are on. :-)
From: studiosysadmins-discuss-bounces@studiosysadmins.com [mailto:studiosysadmins-discuss-bounces@studiosysadmins.com]
On Behalf Of Nick Allevato
Sent: 22 September 2014 05:10
To: studiosysadmins-discuss@studiosysadmins.com
Subject: Re: [SSA-Discuss] iWARP RDMA Solution for Superfast NFS-RDMA Rendering
This seems like a heated topic already; I like it.
I think I had a dirty dream about this once.
It's like Open Compute 56.0
On Sun, Sep 21, 2014 at 9:40 AM, Saker Klippsten <sakerk@gmail.com> wrote:
This is the next big thing, guys.
Yes, the goal is to speed things up and also scale them up while maintaining that speed, all at a cost factor the project can handle, so that you end up with a profit, not a loss (which is what most projects these days seem to be trending towards).
What is the renderer / OS that you are testing? Software plays the largest factor in overall render time.
It's one thing to transfer the dependencies around a cluster at X speed, but there is still X amount of raw compute time to render the frame. If you are chopping a frame up into buckets and have 10 nodes render a single frame, there is going to be a speed advantage for the look-dev stage. If you are rendering a sequence, it's going to take just about the same time.
If I can use GPU rendering, with say 8 of them in a chassis, I don't have to worry about scaling as much, and even then the cost to transfer dependencies is small until the raw compute time drops below that of the network I/O, which it very rarely does. Right now the CPU/GPU work takes longer than transferring the scene file and textures for most projects. Any decent 10GbE network should be fine.
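To make that ratio argument concrete, here's a hypothetical back-of-envelope calculation (the payload size, link efficiency and frame time are illustrative assumptions, not measurements): even a fairly heavy 20 GB of scene and texture dependencies crosses 10GbE in seconds, so the network only becomes the bottleneck once per-frame compute drops below that.

```python
def transfer_seconds(payload_gb: float, link_gbps: float,
                     efficiency: float = 0.8) -> float:
    """Time to move a payload over a link at an assumed wire efficiency."""
    payload_gbit = payload_gb * 8          # bytes -> bits
    return payload_gbit / (link_gbps * efficiency)

# Illustrative numbers: 20 GB of scene + textures over 10GbE at 80% efficiency
wire = transfer_seconds(20, 10)            # seconds on the wire
render = 9 * 60                            # a 9 min/frame render time

# Network I/O only dominates when compute time falls below transfer time
print(f"transfer: {wire:.0f}s, render: {render}s, "
      f"network-bound: {render < wire}")
```

With these numbers the transfer takes about 20 seconds against 540 seconds of compute, which is exactly the "compute dwarfs transfer" situation described above.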
IMO you should be speeding up the render for GPU and CPU, and looking at power costs for that giant cluster ;)
Yes, Mellanox vs. Chelsio, and at 56G. To reach 56G one must utilize InfiniBand and a certain set of adapters. That is the core concept behind really, really big render-time savings. One could go over Ethernet on Windows 2012 or Linux (Debian/Red Hat), but you need to be very sure that the PCIe cards support it. This is the next big thing, guys. I guess many of the ramp-ups across this nation and overseas (hardware procurement) probably did not take anything like this into account. Too bad for those responsible sysadmins..... ;) Because it looks like this stuff is speeding things up quite a bit. You can go 10G if you buy the right PCIe adapter cards, but it's also a BIOS thing, as I have been reading up. Luckily we have ramped up over 800,000 USD in the newest HP hardware here, so I believe the Z820s' and Z840s' BIOS supports this next-gen technology.
There are only a handful of IB cards and they are new, so don't go buy used equipment for these tests; there are also only a handful of 10G Ethernet adapter cards ready and waiting. In my dealings with render farms, the golden rule is to speed things up, right? Well, if you can bump render times down from, say, 90 mins/frame to 9 mins/frame, you are doing your job. FYI: also try to always build your own storage too ;) That's the easy part (and the cheaper part).
Jorg Mohnen, M.Sc. MBA
To unsubscribe from the list send a blank e-mail to mailto:studiosysadmins-discuss-request@studiosysadmins.com?subject=unsubscribe
--
Nicolas Allevato, Ops, 5th Kind