Feat: Add debugging terminal support for CustomJob, HyperparameterTun…#699
google/cloud/aiplatform/jobs.py
(Dict[str, str]) - web access uris of the custom job
"""
self._sync_gca_resource()
return self._gca_resource.web_access_uris
Does this field need to be cast to a dict?
It works the same way as labels: we return labels as-is, and it is a dict.
That looks like a bug:
isinstance(ds.labels, dict)
# False
type(ds.labels)
# google.protobuf.pyext._message.ScalarMapContainer
I created an issue to track that here: b/203653647
My preference is not to carry that issue over.
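To illustrate the cast being asked for, here is a minimal sketch. `FakeScalarMap` and `FakeJob` are hypothetical stand-ins (not part of the SDK or protobuf) that only mimic a mapping-like container that is not a `dict`; the URI is a placeholder.

```python
from collections.abc import Mapping
from typing import Dict, Iterator


class FakeScalarMap(Mapping):
    """Hypothetical stand-in for a proto map container: mapping-like, not a dict."""

    def __init__(self, items: Dict[str, str]) -> None:
        self._items = dict(items)

    def __getitem__(self, key: str) -> str:
        return self._items[key]

    def __iter__(self) -> Iterator[str]:
        return iter(self._items)

    def __len__(self) -> int:
        return len(self._items)


class FakeJob:
    """Hypothetical wrapper that mirrors only the shape of the property in the diff."""

    def __init__(self, raw_uris: Mapping) -> None:
        self._raw_uris = raw_uris

    @property
    def web_access_uris(self) -> Dict[str, str]:
        # Copy into a plain dict so callers get the type the docstring promises.
        return dict(self._raw_uris)


raw = FakeScalarMap({"workerpool0-0": "https://example.invalid/terminal"})
job = FakeJob(raw)
print(isinstance(raw, dict))                  # the raw container is not a dict
print(isinstance(job.web_access_uris, dict))  # the property returns a real dict
```

Copying also means callers cannot mutate the underlying resource through the returned value, which may or may not be desirable here.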
google/cloud/aiplatform/jobs.py
self._sync_gca_resource()

if self._gca_resource.trials:
    return self._gca_resource.trials[-1].web_access_uris
Can trials execute in parallel?
Trials can execute in parallel when parallel_trial_count is set. Updated HyperparameterTuningJob to check the web_access_uris of trials running in parallel.
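One way to cover parallel trials, rather than reading only `trials[-1]`, is to gather the URIs of every trial that reports any. This is a rough sketch under assumptions: `FakeTrial` and `collect_trial_web_access_uris` are hypothetical names, and the actual change in this PR may be structured differently.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class FakeTrial:
    """Hypothetical stand-in for a trial; only the fields used here are modeled."""

    id: str
    web_access_uris: Dict[str, str] = field(default_factory=dict)


def collect_trial_web_access_uris(
    trials: Iterable[FakeTrial],
) -> Dict[str, Dict[str, str]]:
    """Return {trial_id: uris} for every trial that currently exposes URIs."""
    return {
        trial.id: dict(trial.web_access_uris)
        for trial in trials
        if trial.web_access_uris
    }


trials = [
    FakeTrial("1", {"workerpool0-0": "https://example.invalid/trial-1"}),
    FakeTrial("2"),  # not started yet, so no URIs
    FakeTrial("3", {"workerpool0-0": "https://example.invalid/trial-3"}),
]
uris_by_trial = collect_trial_web_access_uris(trials)
print(sorted(uris_by_trial))
```

Keying by trial id keeps the result unambiguous when several trials use the same worker pool names.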
self._gca_resource.training_task_metadata
and self._gca_resource.training_task_metadata.get("backingCustomJob")
and self._gca_resource.training_task_inputs.get("enable_web_access")
and not self._has_logged_web_access_uris
Is it possible the web_access_uris have changed throughout the run? If, for example, one of the workers failed.
With the current service setup, if workers fail and restart, the web_access_uris redirect to the new workers, but the URIs themselves don't change.
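If the URIs are stable for the lifetime of the job, a log-once guard like the `_has_logged_web_access_uris` flag in the condition above is enough. The sketch below shows that pattern in isolation; `FakePollingJob`, its `poll` method, and the `logged` list are hypothetical and stand in for the SDK's polling loop and logger.

```python
from typing import Dict, List


class FakePollingJob:
    """Hypothetical job wrapper showing only the log-once guard."""

    def __init__(self, web_access_uris: Dict[str, str]) -> None:
        self._web_access_uris = web_access_uris
        self._has_logged_web_access_uris = False
        self.logged: List[str] = []  # stands in for a real logger

    def poll(self) -> None:
        """Called on every status check; logs the URIs at most once."""
        if self._web_access_uris and not self._has_logged_web_access_uris:
            for worker, uri in self._web_access_uris.items():
                self.logged.append(f"{worker}: {uri}")
            self._has_logged_web_access_uris = True


job = FakePollingJob({"workerpool0-0": "https://example.invalid/terminal"})
for _ in range(3):  # repeated polls do not repeat the log line
    job.poll()
print(job.logged)
```

This guard would miss updates if the service ever started issuing new URIs after a restart, so it relies on the behavior described in the reply above.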
Fixes #<b/195449603> 🦕