Description
While digging into some test failures related to leaked ZkStateReader objects, I noticed a pattern which I believe can be explained by the fact that ZkClientClusterStateProvider does not complain or fail if a caller tries to connect() to it or otherwise use it after it has already been closed. In this situation it simply creates a new ZkStateReader, which is later leaked.
So in situations where background/timer threads use a SolrClientCloudManager/ZkClientClusterStateProvider, we might see...
T1: start shutdown...
T1: ...SolrClientCloudManager.close()...
T1: ...ZkClientClusterStateProvider.close()...
T1: ...ZkStateReader.close()
T1: ...zkStateReader = null;
T2: run background thread/task/trigger...
T2: ...get ZkClientClusterStateProvider
T2: ...call ZkClientClusterStateProvider.connect()
T2: ...zkStateReader = new ZkStateReader() /* LEAKED */
T2: ...do something with ZkClientClusterStateProvider
T2: ...finish background thread/task/trigger
T1: ...finish shutdown of ZkClientClusterStateProvider / SolrClientCloudManager
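
One possible way to avoid this kind of leak is to remember that close() has been called and have connect() fail afterwards instead of silently re-creating the reader. The sketch below illustrates the idea only. FakeStateReader and GuardedProvider are hypothetical stand-ins, not the real ZkStateReader / ZkClientClusterStateProvider, and the choice of IllegalStateException is an assumption rather than what Solr actually throws.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ClosedGuardSketch {

    /** Hypothetical stand-in for ZkStateReader; counts instances that are still open. */
    static class FakeStateReader implements AutoCloseable {
        static final AtomicInteger OPEN = new AtomicInteger();

        FakeStateReader() {
            OPEN.incrementAndGet();
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    /** Hypothetical stand-in for ZkClientClusterStateProvider. */
    static class GuardedProvider implements AutoCloseable {
        private FakeStateReader reader; // guarded by "this"
        private boolean closed;         // guarded by "this"

        synchronized void connect() {
            // Refuse to re-create the reader once close() has run.
            if (closed) {
                throw new IllegalStateException("provider already closed");
            }
            if (reader == null) {
                reader = new FakeStateReader();
            }
        }

        @Override
        public synchronized void close() {
            closed = true;
            if (reader != null) {
                reader.close();
                reader = null;
            }
        }
    }

    public static void main(String[] args) {
        GuardedProvider provider = new GuardedProvider();
        provider.connect();  // normal use before shutdown
        provider.close();    // T1: shutdown
        try {
            provider.connect();  // T2: late background task
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("open readers: " + FakeStateReader.OPEN.get());
    }
}
```

With this guard the late connect() from the background thread fails loudly, and no reader outlives the shutdown. Both methods are synchronized so that a connect() racing with close() either completes before the close (and its reader is then closed) or observes the closed flag.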