Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to content

Remove validate_pycapsule#1426

Open
Tpt wants to merge 1 commit intoapache:mainfrom
Tpt:tpt/validate_pycapsule
Open

Remove validate_pycapsule#1426
Tpt wants to merge 1 commit intoapache:mainfrom
Tpt:tpt/validate_pycapsule

Conversation

@Tpt
Copy link

@Tpt Tpt commented Mar 13, 2026

Rationale for this change

The Bound<'_, PyCapsule>::pointer_checked method does the same validation and is already used across the codebase

What changes are included in this PR?

  • remove validate_pycapsule
  • remove some not useful .map_err(py_datafusion_err)?
  • replace c_str! calls by c-literals (now supported by MSRV)

Are there any user-facing changes?

Some error messages have changfed

@Tpt Tpt force-pushed the tpt/validate_pycapsule branch 2 times, most recently from 2698aad to 71c0273 Compare March 13, 2026 16:49
The Bound<'_, PyCapsule>::pointer_checked does the same validation and is already used across the codebase
@Tpt Tpt marked this pull request as ready for review March 13, 2026 17:20
@Tpt Tpt force-pushed the tpt/validate_pycapsule branch from 71c0273 to 38d5e6b Compare March 13, 2026 17:20
Comment on lines -50 to -61
if capsule_name.is_none() {
return Err(PyValueError::new_err(format!(
"Expected {name} PyCapsule to have name set."
)));
}

let capsule_name = unsafe { capsule_name.unwrap().as_cstr().to_str()? };
if capsule_name != name {
return Err(PyValueError::new_err(format!(
"Expected name '{name}' in PyCapsule, instead got '{capsule_name}'"
)));
}
Copy link
Contributor

@kevinjqliu kevinjqliu Mar 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We lose the error messages by removing this function. Would this be potentially confusing to end users?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. The error messages from PyCapsule_GetPointer are ValueError("PyCapsule_GetPointer called with invalid PyCapsule object") and ValueError("PyCapsule_GetPointer called with incorrect name"). These error only happen if the "magic" methods return invalid capsules so they should only be encountered by people trying to implement them and not regular DataFusion users. Also, the stack trace should give a hint of which capsule is raising the error.

An option might be to wrap the error in a nicer message. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out the error messages from PyCapsule_GetPointer! They do seem fairly similar 😄

I still think it could be helpful to expose validate_pycapsule as a public function for implementers. While PyCapsule_GetPointer errors will mostly be encountered by developers implementing the “magic” methods, having a helper with clearer validation and error messages could make debugging easier. Since https://github.com/apache/datafusion-python/pull/1414/files#diff-6dc4fc730855f068c83d61b03b141832727bb30bc71788f763761ef8cd2bde7eR154 introduces a new datafusion-python-util crate that includes this helper, exposing it there could provide implementers with a simple way to validate capsules before extracting pointers.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out the error messages from PyCapsule_GetPointer! They do seem fairly similar 😄

I am also contributing to PyO3. It made me move deep in some CPython rabbit hole.

having a helper with clearer validation and error messages could make debugging easier

If we want nice helper what about abstracting as much as possible the PyCapsule details to get an API like:

#[pymethods]
impl MyTableProvider {
    pub fn __datafusion_table_provider__<'py>(
        &self,
        py: Python<'py>,
        session: PyDataFusionCapsule<Arc<dyn LogicalExtensionCodec>>,
    ) -> PyResult<PyDataFusionCapsule<Arc<dyn TableProvider>>> {
    }
}

If you want I can give it a try after #1414 is merged and in the meantime close this MR. My hope it that it could work by playing around with some traits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants