polars performance with simple array arithmetic much slower than NumPy? #18088
This is quite similar to #17414. Interesting to see that this is happening on AMD, however.
We looked into it further and have narrowed it 100% down to our use of:

```rust
#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;
```
```rust
fn sub_one(x: &[f64]) -> Vec<f64> {
    // Allocates a fresh Vec on every call, so the allocator is on the hot path.
    x.iter().map(|x| *x - 1.0).collect()
}

fn main() {
    let v = vec![0.0; 1_000_000];
    let start = std::time::Instant::now();
    for _ in 0..10_000 {
        // black_box keeps the compiler from optimizing the work away.
        std::hint::black_box(sub_one(std::hint::black_box(&v)));
    }
    dbg!(start.elapsed() / 10_000);
}
```

We will have to figure out if this is something we can resolve by configuring jemalloc.
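One way to confirm the allocator hypothesis is an A/B comparison: build the benchmark above once with the jemalloc `#[global_allocator]` and once without it. This is a sketch of the second build, not the maintainers' exact method; omitting the attribute simply falls back to the system allocator.

```rust
// Sketch: the identical benchmark with no #[global_allocator] attribute,
// so it runs on the system allocator. Comparing its timing against the
// jemalloc build isolates the allocator as the only variable.
fn sub_one(x: &[f64]) -> Vec<f64> {
    x.iter().map(|x| *x - 1.0).collect()
}

fn main() {
    let v = vec![0.0; 1_000_000];
    let reps = 100u32; // fewer iterations than the original; enough to compare
    let start = std::time::Instant::now();
    for _ in 0..reps {
        std::hint::black_box(sub_one(std::hint::black_box(&v)));
    }
    dbg!(start.elapsed() / reps);
}
```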
Ok, we figured out that the slowdown is caused by page faults that the default jemalloc configuration incurs. Setting the … Strangely enough, setting …
Can people try setting …
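The request above is truncated, and the exact setting the maintainers had in mind is not recoverable here. As a hedged sketch of the kind of experiment it suggests: jemalloc's documented decay options can be supplied at runtime through its config environment variable (the jemallocator crate typically reads the prefixed `_RJEM_MALLOC_CONF` rather than plain `MALLOC_CONF`):

```shell
# Sketch, not the maintainers' exact suggestion: disable jemalloc's
# time-based page decay so freed pages are kept rather than returned to
# the OS and re-faulted on the next allocation. dirty_decay_ms and
# muzzy_decay_ms are documented jemalloc options; -1 disables the decay.
# "./my_benchmark" is a placeholder for whatever program you are timing.
_RJEM_MALLOC_CONF="dirty_decay_ms:-1,muzzy_decay_ms:-1" ./my_benchmark
```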
Description
I was working on a histogram implementation for polars in Positron and I stumbled on surprising performance differences for operations with float64 arrays.
I haven't looked too deeply, but I am on an AMD Ryzen 7 PRO 7840U, which has AVX2/AVX-512 extensions, so almost a 10x difference in performance surprised me. I will just convert polars arrays to NumPy arrays and do the work there, but I would be interested in diagnosing the problem further to develop an intuition for when I should write polars code vs. dropping down to NumPy for numerical operations.
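A minimal sketch of the kind of comparison described above (hypothetical code, not the poster's actual benchmark; the polars half is guarded in case polars is not installed):

```python
import time

import numpy as np

def bench(label, f, reps=50):
    """Time f() over reps calls and report the mean per-call time."""
    start = time.perf_counter()
    for _ in range(reps):
        f()
    elapsed = (time.perf_counter() - start) / reps
    print(f"{label}: {elapsed * 1e3:.3f} ms per call")
    return elapsed

# Same shape and dtype as the Rust reproduction: one million float64 zeros.
arr = np.zeros(1_000_000, dtype=np.float64)
numpy_time = bench("numpy x - 1.0", lambda: arr - 1.0)

try:
    import polars as pl
    s = pl.Series("x", arr)
    polars_time = bench("polars s - 1.0", lambda: s - 1.0)
except ImportError:
    # polars may not be installed in every environment; the NumPy half
    # alone still shows the baseline cost of the operation.
    pass
```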