It depends on a whole range of factors:
One gotcha regarding TDB2 and RAM usage: TDB2 relies on memory-mapping the database files into the OS file cache to maximise performance. A common mistake is to set the JVM heap size very high, which leaves the JVM competing with the OS for memory and tanks performance as the memory-mapped files get swapped in and out of memory. The rule of thumb is to set a low JVM heap (typically 2-4GB) and leave the rest of the memory free for the OS to cache the memory-mapped database files. If running in a containerised environment, make sure the memory allocated to the container accounts for both of these sources of memory usage, and that you explicitly set the JVM heap size so it doesn't just grab its default (which is usually 1/4 of the total memory reported by the OS).
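To make the arithmetic behind that rule of thumb concrete, here is a minimal sketch in Python. The function name `plan_memory` and its return keys are made up for illustration, and it deliberately ignores JVM non-heap overhead (metaspace, thread stacks), so treat the numbers as a rough budget rather than an exact one. The 1/4 default is the "usual" figure mentioned above, not a guarantee for every JVM.

```python
def plan_memory(total_gib: float, heap_gib: float = 4.0) -> dict:
    """Split a machine/container memory budget between JVM heap and OS page cache.

    Hypothetical helper: assumes memory is either JVM heap or available to the
    OS for caching memory-mapped TDB2 files (non-heap JVM overhead is ignored).
    """
    if heap_gib <= 0 or heap_gib >= total_gib:
        raise ValueError("heap must be positive and smaller than total memory")
    return {
        # Explicit heap flag to pass to the JVM, in MiB
        "jvm_flag": f"-Xmx{int(heap_gib * 1024)}m",
        # What is left for the OS to cache the memory-mapped database files
        "page_cache_gib": total_gib - heap_gib,
        # What the JVM would typically grab if no heap size were set (assumed 1/4)
        "default_heap_gib": total_gib / 4,
    }


plan = plan_memory(total_gib=32, heap_gib=4)
print(plan)
```

On a 32GB container this leaves 28GB for the file cache with an explicit 4GB heap, versus roughly 24GB if the JVM were left to pick an 8GB default heap.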
Maybe. The only real way to know is to benchmark on representative data and queries for your use case. Some tools for this exist, e.g. my own sparql-query-bm, and lots of "standard" benchmarks exist (LUBM, SP2B, etc.), but they often aren't representative of real-world use cases.
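As a rough illustration of what such a home-grown benchmark could look like (this is not how sparql-query-bm works, just a minimal sketch), the Python below fires a list of queries at a SPARQL endpoint from a pool of worker threads and reports latency percentiles. The endpoint URL in the usage comment is an assumption about a local Fuseki setup, and the function names are invented for the example. The query runner is passed in as a callable so the harness can be tried out without a server.

```python
import math
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def sparql_select(endpoint: str, query: str, timeout: float = 30.0) -> bytes:
    """POST a query directly, as allowed by the SPARQL 1.1 Protocol."""
    req = urllib.request.Request(
        endpoint,
        data=query.encode("utf-8"),
        headers={
            "Content-Type": "application/sparql-query",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def timed(run_query, query) -> float:
    """Return the wall-clock seconds taken by one query."""
    start = time.perf_counter()
    run_query(query)
    return time.perf_counter() - start


def benchmark(run_query, queries, concurrency: int) -> dict:
    """Run all queries with `concurrency` simulated simultaneous users."""
    if not queries:
        raise ValueError("need at least one query")
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(lambda q: timed(run_query, q), queries))
    # Nearest-rank 95th percentile
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
    return {
        "count": len(latencies),
        "median_s": statistics.median(latencies),
        "p95_s": latencies[p95_index],
    }


# Usage against a real server (endpoint URL is an assumption, adjust to yours):
#   endpoint = "http://localhost:3030/ds/sparql"
#   queries = ["SELECT * WHERE { ?s ?p ?o } LIMIT 10"] * 1000
#   print(benchmark(lambda q: sparql_select(endpoint, q), queries, concurrency=100))
```

Repeating the run at increasing `concurrency` values with your own mix of simple and complex queries shows where latency starts to climb, which is the number you actually care about.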
There are several options that have been used/put together by folks associated with the Jena project in the past:
Hi there (again),
I'm trying to answer the following question: "how many users (or a mix of simple and complex queries) can be handled by Jena/TDB2 simultaneously before performance starts to become an issue".
Now, TDB2 seems to be limited to a single virtual/physical machine, and the question above is rather vague. Let's assume RAM isn't limited and the graph is nowhere near the limits of what TDB2 can store (maybe up to 10 million triples, to be extremely conservative for my application).
Now the question for me is: can I employ Jena for simultaneous queries of (O-Notation):
O(100) users
O(1000) users
O(10000) users?
I did search for benchmarks, but so far they have not really answered my question (maybe I just found the wrong ones?). The people who work on our project are a little concerned that we can't scale horizontally later on. I would be grateful for hints/suggestions/answers :)
It seems there is also a project called the 'SANSA' stack (https://sansa-stack.net/) which kind of builds on top of Jena (at least Jena modules appear quite often in their source code), but so far I have not had a reply from the devs about how far along the package is and what functionality it covers.