Use SCTK package_only in inspect_packages pipeline #1118

Merged 4 commits on Mar 25, 2024

@AyanSinhaMahapatra AyanSinhaMahapatra commented Mar 13, 2024

This PR adds the following changes in inspect_packages pipeline:

  • The key update is using the newly added package_only attribute in the get_package_data API, which is equivalent to the new --package-only CLI option in SCTK. This attribute is optional. When set to True, the scan only reads package manifests for package data and skips license and copyright detection on the extracted license statements and copyrights, which makes this step faster.
  • Also adds a couple of updates related to the restructured inspect_packages pipeline introduced in Restructure pipelines for verbosity #1074:
    • Creation of packages/dependencies is moved to a separate pipe in the pipeline, so we have the time taken to scan for package data and the time taken to create packages/dependencies separately, to enable comparison with the previous pipeline.
    • Creates packages only if a purl is present; otherwise package creation is skipped. (Since we are not doing package assembly, we cannot determine whether a failure to create a package is an issue, so the messages were noise.)
    • Skips creating packages from package handlers of type models.NonAssemblableDatafileHandler (for example, autotools configure scripts: https://github.com/nexB/scancode-toolkit/blob/develop/src/packagedcode/build.py#L50C33-L50C68)
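
The two filtering rules above (skip non-assemblable handlers, skip entries without a purl) can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the function name `select_creatable_packages`, the `NON_ASSEMBLABLE_DATASOURCE_IDS` set, and the sample entries are hypothetical stand-ins, and the real pipeline checks the handler type rather than a set of ids.

```python
# Hypothetical stand-in: the real check is against handlers of type
# models.NonAssemblableDatafileHandler, not a hard-coded set of ids.
NON_ASSEMBLABLE_DATASOURCE_IDS = {"autotools_configure"}


def select_creatable_packages(package_data_entries):
    """Return the package data entries for which a package should be created."""
    selected = []
    for entry in package_data_entries:
        # Rule 1: skip data coming from non-assemblable handlers.
        if entry.get("datasource_id") in NON_ASSEMBLABLE_DATASOURCE_IDS:
            continue
        # Rule 2: skip entries without a purl instead of logging an error.
        if not entry.get("purl"):
            continue
        selected.append(entry)
    return selected


# Illustrative sample data, shaped loosely like package data mappings.
entries = [
    {"datasource_id": "pypi_setup_py", "purl": "pkg:pypi/example@1.0"},
    {"datasource_id": "autotools_configure", "purl": None},
    {"datasource_id": "npm_package_json", "purl": None},
]
creatable = select_creatable_packages(entries)
print([entry["purl"] for entry in creatable])
```

Only the first sample entry passes both checks; the other two are skipped silently, which matches the intent of dropping the noisy messages.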


@tdruez tdruez left a comment

@AyanSinhaMahapatra What's the latest status on a SCTK release? Also, could you add a changelog entry for those changes?


tdruez commented Mar 25, 2024

@AyanSinhaMahapatra please provide context and changelog for those changes.
"Update inspect_packages pipeline" is not explicit.

* Split package/dependencies creation into a separate step
* Only create packages/dependencies from Assemblable PackageData

Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Support the new package_only attribute in the scancode
get_package_data API, to only scan for package data and
skip license and copyright detection.

Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
@AyanSinhaMahapatra AyanSinhaMahapatra changed the title Update inspect_packages pipeline Use SCTK package_only in inspect_packages pipeline Mar 25, 2024
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
@AyanSinhaMahapatra

@tdruez I've rebased the changes on the latest main, which includes the new SCTK release v32.1.0. I've also added more context for the changes in the PR description above and added a CHANGELOG entry. This is ready for review now. Thanks!

Reference: #1087
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Successfully merging this pull request may close these issues.

test_scanpipe_resolve_dependencies_pipeline_integration_misc takes over 2min to run