-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GraphBolt][CUDA] sampled_edge_ids
to fetch indices later.
#7664
Conversation
To trigger regression tests:
|
@@ -573,23 +573,28 @@ def _convert_to_sampled_subgraph( | |||
indices = C_sampled_subgraph.indices | |||
type_per_edge = C_sampled_subgraph.type_per_edge | |||
column = C_sampled_subgraph.original_column_node_ids | |||
original_edge_ids = C_sampled_subgraph.original_edge_ids | |||
sampled_edge_ids = C_sampled_subgraph.original_edge_ids |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also rename the original_edge_ids in C_sampled_subgraph?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#7703 I updated the doc string to clarify. It is indeed hard to rename.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sampled_edge_ids -> edge_ids_in_fused_csc_sampling_graph.
@@ -45,6 +45,8 @@ class SampledSubgraphImpl(SampledSubgraph): | |||
] = None | |||
original_row_node_ids: Union[Dict[str, torch.Tensor], torch.Tensor] = None | |||
original_edge_ids: Union[Dict[str, torch.Tensor], torch.Tensor] = None | |||
# Used to fetch sampled_csc.indices if it is missing. | |||
_sampled_edge_ids: Union[Dict[str, torch.Tensor], torch.Tensor] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto: _edge_ids_in_fused_csc_sampling_graph
The name has to be clear since it is indeed non-intuitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert pair.indptr is not None and isinstance( | ||
pair.indptr, torch.Tensor | ||
), "Node pair should be have indptr of type torch.Tensor." | ||
# For CUDA, indices may be None because it will be fetched later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering how much performance we can gain by asynchronously fetch indices, esp, we fetch original_edge_ids in the sync way.
Please make balance on performance gain and code structure complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
25%
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
25% is a lot so we needed to do it.
Description
We need to make a distinction between
sampled_edge_ids
andoriginal_edge_ids
.sampled_edge_ids
will be specifically used to fetch indices later as indices tensor does not need permutation with theORIGINAL_EDGE_ID
.Checklist
Please feel free to remove inapplicable items for your PR.
Changes