Improve License Clarity at Top Package Level #3792

Draft
wants to merge 36 commits into
base: develop

Conversation

swastkk
Collaborator

@swastkk swastkk commented Jun 1, 2024

Fixes #3802

Tasks

  • Reviewed contribution guidelines
  • PR is descriptively titled 📑 and links the original issue above 🔗
  • Commits are in a uniquely-named feature branch and have no merge conflicts 📁
  • Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
  • Updated documentation pages (if applicable)
  • Updated CHANGELOG.rst (if applicable)
    Run tests locally to check for errors.

Signed-off-by: swastkk swastkk@gmail.com

Signed-off-by: swastik <swastkk@gmail.com>
Signed-off-by: swastik <swastkk@gmail.com>
Signed-off-by: swastik <swastkk@gmail.com>
Member

@AyanSinhaMahapatra AyanSinhaMahapatra left a comment

@swastkk you are regenerating all the test fixtures, which creates useless churn. This adds more things to review and is not ideal. Let's do things differently: delete your last commit and push regenerated test fixtures only for the tests that were failing.

Signed-off-by: swastik <swastkk@gmail.com>
@AyanSinhaMahapatra
Member

@swastkk this is not correct atm:

  1. Why use license_clarity and not license_clarity_score? The latter is what we use in the summary option.
  2. We want this new attribute license_clarity_score added to top-level packages if and only if the --package-summary option is used, and not in every package like you have here.

This new plugin could possibly live in packagedcode/plugin_package.py, as we need to check whether this option is enabled in the process_codebase step of the package plugin, and then pass that value down to the package.to_dict() function.
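
A minimal sketch of the gating asked for in point 2, assuming a simplified stand-in for the package model. The `Package` dataclass and the `package_summary` keyword below are illustrative assumptions, not the actual packagedcode code:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Package:
    # Hypothetical, simplified stand-in for a top-level package model.
    name: str
    declared_license_expression: Optional[str] = None
    license_clarity_score: Optional[dict] = None

    def to_dict(self, package_summary=False):
        data = {
            "name": self.name,
            "declared_license_expression": self.declared_license_expression,
        }
        # Emit the extra attribute only when the option was requested,
        # so default scan output keeps its existing shape.
        if package_summary:
            data["license_clarity_score"] = self.license_clarity_score
        return data
```

With this shape, `Package(name="change-case").to_dict()` has no `license_clarity_score` key at all, so existing expected test files would not need to be regenerated.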

Signed-off-by: swastik <swastkk@gmail.com>
Member

@AyanSinhaMahapatra AyanSinhaMahapatra left a comment

Does this make sure the license_clarity_score attribute is added only when the --package-summary command line option is used? No. Look at your test regenerations: none of these tests have this option enabled, yet they still have this attribute added.

You have to pass the package_summary option like we do for the package_only CLI option here: https://github.com/nexB/scancode-toolkit/blob/develop/src/packagedcode/plugin_package.py#L203. Then pass it further down to create_package_and_deps at https://github.com/nexB/scancode-toolkit/blob/develop/src/packagedcode/plugin_package.py#L263 and on to package.to_dict() at https://github.com/nexB/scancode-toolkit/blob/develop/src/packagedcode/plugin_package.py#L367. Only then can you correctly set the package_summary attribute you added through this PR at https://github.com/swastkk/scancode-toolkit/blob/improve-license-clarity/src/packagedcode/models.py#L1544. Otherwise it is always set to one value.
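
The pass-through described above could look roughly like the sketch below. The function names are borrowed from this comment, but the signatures are deliberately simplified assumptions and do not match the real ones in plugin_package.py; `Package` is a hypothetical stand-in:

```python
class Package:
    """Hypothetical, stripped-down package model (not the real packagedcode class)."""

    def __init__(self, name, license_clarity_score=None):
        self.name = name
        self.license_clarity_score = license_clarity_score

    def to_dict(self, package_summary=False):
        data = {"name": self.name}
        if package_summary:
            data["license_clarity_score"] = self.license_clarity_score
        return data


def create_package_and_deps(packages, package_summary=False):
    # Simplified signature: forwards the flag instead of dropping it.
    return [p.to_dict(package_summary=package_summary) for p in packages]


def process_codebase(packages, package_summary=False, **kwargs):
    # Entry point: receives the CLI option value and passes it down.
    return create_package_and_deps(packages, package_summary=package_summary)
```

If any layer in the chain forgets to forward the flag, `to_dict()` falls back to its default and the attribute is either always present or always absent, which matches the symptom described in the review.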

…mary as Postscan Plugin

Signed-off-by: swastik <swastkk@gmail.com>
Member

@AyanSinhaMahapatra AyanSinhaMahapatra left a comment

More changes required.
Please merge latest develop afterwards.

tests/packagedcode/data/plugin/help.txt (outdated, resolved)
src/packagedcode/plugin_package.py (outdated, resolved)
src/packagedcode/plugin_package.py (outdated, resolved)
src/packagedcode/models.py (outdated, resolved)
src/packagedcode/models.py (outdated, resolved)
tests/scancode/data/help/help.txt (outdated, resolved)
…inor changes

Signed-off-by: swastik <swastkk@gmail.com>
Signed-off-by: swastik <swastkk@gmail.com>
Signed-off-by: swastik <swastkk@gmail.com>
Member

@AyanSinhaMahapatra AyanSinhaMahapatra left a comment

See comments above. Also, what is the status on populating license clarity and the other attributes?

Signed-off-by: swastik <swastkk@gmail.com>
Signed-off-by: swastik <swastkk@gmail.com>
Member

@AyanSinhaMahapatra AyanSinhaMahapatra left a comment

Couple more nits.

tests/packagedcode/test_plugin_package.py (outdated, resolved)
tests/scancode/data/help/help.txt (outdated, resolved)
src/packagedcode/models.py (outdated, resolved)
src/packagedcode/plugin_package.py (outdated, resolved)
src/packagedcode/plugin_package.py (outdated, resolved)
Signed-off-by: swastik <swastkk@gmail.com>
…arity_score nexB#3817

Signed-off-by: swastik <swastkk@gmail.com>
@swastkk swastkk linked an issue Jul 4, 2024 that may be closed by this pull request
…e(Without other_license_detections)

Signed-off-by: swastik <swastkk@gmail.com>
@swastkk swastkk self-assigned this Jul 5, 2024
…date test, nexB#3802

Signed-off-by: swastik <swastkk@gmail.com>
Signed-off-by: swastik <swastkk@gmail.com>
…esources nexB#1395

Signed-off-by: swastik <swastkk@gmail.com>
Signed-off-by: swastik <swastkk@gmail.com>
…e get_field_values_from_resources nexB#3287 nexB#1395

Signed-off-by: swastik <swastkk@gmail.com>
…s well package level nexB#3287 nexB#1395

Signed-off-by: swastik <swastkk@gmail.com>
Signed-off-by: swastik <swastkk@gmail.com>
…bute class & instance nexB#3862

Signed-off-by: swastik <swastkk@gmail.com>
@swastkk swastkk linked an issue Jul 17, 2024 that may be closed by this pull request
Member

@AyanSinhaMahapatra AyanSinhaMahapatra left a comment

See initial comments: We need some major refactoring.

Also, please do not include multiple empty files for a test when one is enough. For example, you have multiple empty files under tests/packagedcode/data/package_summary/change-case-change-case-5.4.4.zip-extract/change-case-change-case-5.4.4/packages/change-case/src/. Also add some license expressions to one of the source files (only one, not all of them) to check whether other_license_expression is correctly populated.

src/summarycode/score.py (outdated, resolved)
src/packagedcode/plugin_package.py (outdated, resolved)
src/packagedcode/plugin_package.py (outdated, resolved)
src/packagedcode/plugin_package.py (outdated, resolved)
src/packagedcode/plugin_package.py (outdated, resolved)
src/packagedcode/plugin_package.py (outdated, resolved)
src/packagedcode/plugin_package.py (outdated, resolved)
src/packagedcode/plugin_package.py (outdated, resolved)
values.append(value)

if is_codebase:
    for resource in resources.walk(topdown=True):
Member

Suggested change
- for resource in resources.walk(topdown=True):
+ for resource in codebase.walk(topdown=True):

This is wrong and extremely confusing code. You cannot store a codebase object in a resources variable and walk/get attributes from it depending on a flag value. You should only pass resources here from before, and these should always be resource objects consistently. See other comments on this.

for value in getattr(resource, field_name, []) or []:
    values.append(value)

if is_codebase:
Member

Let's not make this function complex. Instead, we should call codebase.get_resource(path=resource_path) to get the resource object for a specific path string (make sure you test this with the strip_root CLI option).

We should not do two things that are basically the same depending on a flag value. Instead, we should make the inputs the same, so that the same code is valid for both.
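
As a rough illustration of that normalization step, the sketch below resolves path strings into resource objects before any further processing. `Resource` and `Codebase` here are hypothetical minimal stand-ins, and `resources_for_paths` is an invented helper name; only the `get_resource(path=...)` call shape comes from the comment above:

```python
class Resource:
    """Hypothetical minimal resource with a path and one license field."""

    def __init__(self, path, license_expressions=None):
        self.path = path
        self.license_expressions = license_expressions or []


class Codebase:
    """Hypothetical minimal codebase that maps a path string to a Resource."""

    def __init__(self, resources):
        self._by_path = {r.path: r for r in resources}

    def get_resource(self, path):
        # Returns None when no resource exists at this path.
        return self._by_path.get(path)

    def walk(self, topdown=True):
        return iter(self._by_path.values())


def resources_for_paths(codebase, paths):
    # Turn path strings into resource objects so that downstream
    # code only ever deals with one kind of input.
    resources = []
    for path in paths:
        resource = codebase.get_resource(path=path)
        if resource is not None:
            resources.append(resource)
    return resources
```

Paths that do not resolve are silently skipped in this sketch. Whether stored paths still resolve when strip_root is used is exactly what the requested test should confirm.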

Member

This is true everywhere else: you are using resources to carry both codebase objects and resource dictionaries. This is never a good thing. In both cases, make sure you only carry a list of resource objects, with the codebase object passed separately (in some cases it is needed for certain functions).

Collaborator Author

I didn't understand why we would call codebase.get_resource(path=resource_path). We use a single function, compute_license_score, and inside it we call get_field_values_from_resources (which can be renamed). For the package summary, we pass the complete Package with all its resources combined together in the resource field. Wouldn't it be a better approach to use a flag: collect field values from the Package object, which already contains its resources, for the package summary, and collect field values from the complete codebase for the overall codebase summary?

Like using package_level_summary flag in the compute_license_score func or something else?

Member

understand why we will be doing the codebase.get_resource(path=resource_path)

Because in one case we are using a resource mapping and in the other case we are using resource objects. This should not be the case; it should always be objects.

See also the following comment:

you are using resources to carry both codebase objects and resource dictionaries.

This is not okay. We should instead modify the function so that it always takes a list of resources (for a specific package, or alternatively all the resources in the codebase) passed in as an argument.
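
One way to read this suggestion, as a sketch only: the collector takes a plain list of resource objects and no flag, and each caller decides which list to pass. The loop body mirrors the snippet under review; the sample data and the field names `for_packages` and `license_expressions` are assumptions for illustration:

```python
from types import SimpleNamespace


def get_field_values_from_resources(resources, field_name):
    # Single code path: `resources` is always an iterable of resource objects.
    values = []
    for resource in resources:
        for value in getattr(resource, field_name, []) or []:
            values.append(value)
    return values


# Illustrative stand-ins for resource objects (assumed field names).
all_resources = [
    SimpleNamespace(path="pkg/index.js", for_packages=["pkg"], license_expressions=["mit"]),
    SimpleNamespace(path="pkg/README", for_packages=["pkg"], license_expressions=None),
    SimpleNamespace(path="other/util.c", for_packages=[], license_expressions=["gpl-2.0"]),
]

# Package-level summary: the caller filters down to that package's resources.
package_resources = [r for r in all_resources if "pkg" in r.for_packages]

codebase_values = get_field_values_from_resources(all_resources, "license_expressions")
package_values = get_field_values_from_resources(package_resources, "license_expressions")
```

The choice between a package-level and a codebase-level summary then lives entirely in how the list is built, so the scoring function itself needs no is_codebase branch.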

…_package_resources to get package resources

Signed-off-by: swastik <swastkk@gmail.com>
Signed-off-by: swastik <swastkk@gmail.com>
2 participants