What is the optimal number of records per shard?

Hello,

I am currently developing a PIR server using the pir-server-example repository.

We expect our dataset to contain a total of 10 million URLs. At that scale, what shard size (number of records per shard) would best balance computational latency against communication overhead?

Any advice or best practices for handling a dataset of this scale would be greatly appreciated.

Thank you.

Thanks for taking the time to share your question here. Unfortunately, it hasn't received an answer yet. Here are a few suggestions that might help it attract more attention:

  • Provide more details: Expanding on your post to include any error messages, code snippets, steps you've already taken to troubleshoot, and the expected/actual outcomes would be very helpful.
  • Be specific about your technology stack: Clearly state the programming languages, frameworks, or tools you are using.
  • Check for duplicates: Before posting, make sure your question hasn't been asked before. You can use the search bar to find similar threads.

I'm sure someone in the community will be able to help once you have a chance to update your post.

Albert
  Worldwide Developer Relations.
