---------- Forwarded message ----------
From: Tomer Nussbaum <tomer.nussbaum@mail.huji.ac.il>
Date: Thu, Nov 23, 2017 at 6:08 PM
Subject: Cluster best configurations for YT
To: yt-users@lists.spacepope.org



Hi,

We recently upgraded our cluster, but our YT platform runs very slowly (it loads one snapshot an hour when a couple of people run at the same time...).
I wanted to ask whether you could help us solve this issue, based on your experience.

We use InfiniBand, and we see that the main problem is a large number of random-access requests to our hard drives,
so the fix may lie in fine-tuning YT on the nodes or in fine-tuning the file system.


This raises a few questions (there may be more; if you have other ideas, I would be grateful to hear them):
  1. I/O reads - Is there a way to optimize the I/O read requests sent to the file server?
  2. YT configuration - Are there specific parameters in the YT configuration for this situation?
  3. File server behavior - Can we configure how the file server serves reads (we use hard drives, with serial reads)?
  4. Metadata cache - Would using a metadata cache solve the issue?
  5. zlib usage - How can I check whether this feature is enabled, and how important is it for the YT platform?

I would really appreciate any help with this.
Thanks,
Tomer