Description
It is currently unclear how the data source v2 API should be abstracted. The abstraction should either be unified across batch and streaming, or be similar for both with a well-defined difference between them. It should also include catalog/table.
An example of the abstraction:
batch: catalog -> table -> scan
streaming: catalog -> table -> stream -> scan
We should refactor the data source v2 API according to this abstraction.
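To make the two chains concrete, here is a minimal sketch of how they might look as Java interfaces. Every name in it (`Catalog`, `Table`, `Stream`, `Scan`, `InMemoryTable`, `InMemoryCatalog`, `readBatch`, `readMicroBatch`) is hypothetical and chosen for illustration; none of it is the actual Spark API, and the offset-based `Stream` is only one assumed way to express the extra streaming level.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: all names below are hypothetical, not the Spark API.
public class DataSourceV2Sketch {

    /** Leaf of both chains: a scan produces rows. */
    interface Scan {
        List<String> readRows();
    }

    /** Streaming-only level: yields one scan per offset range (assumed design). */
    interface Stream {
        long latestOffset();
        Scan newScan(long startOffset, long endOffset);
    }

    /** A table can be scanned directly (batch) or opened as a stream. */
    interface Table {
        Scan newScan();
        Stream newStream();
    }

    /** A catalog resolves table names to tables. */
    interface Catalog {
        Table loadTable(String name);
    }

    /** Toy table backed by an in-memory list of rows. */
    static final class InMemoryTable implements Table {
        private final List<String> rows;

        InMemoryTable(List<String> rows) {
            this.rows = new ArrayList<>(rows);
        }

        @Override
        public Scan newScan() {
            // Batch scan: return a copy of all rows.
            return () -> new ArrayList<>(rows);
        }

        @Override
        public Stream newStream() {
            return new Stream() {
                @Override
                public long latestOffset() {
                    return rows.size();
                }

                @Override
                public Scan newScan(long startOffset, long endOffset) {
                    // Streaming scan: return only the rows in [startOffset, endOffset).
                    return () -> new ArrayList<>(rows.subList((int) startOffset, (int) endOffset));
                }
            };
        }
    }

    /** Toy catalog backed by a map from table name to table. */
    static final class InMemoryCatalog implements Catalog {
        private final Map<String, Table> tables;

        InMemoryCatalog(Map<String, Table> tables) {
            this.tables = tables;
        }

        @Override
        public Table loadTable(String name) {
            Table table = tables.get(name);
            if (table == null) {
                throw new IllegalArgumentException("No such table: " + name);
            }
            return table;
        }
    }

    // batch: catalog -> table -> scan
    static List<String> readBatch(Catalog catalog, String name) {
        return catalog.loadTable(name).newScan().readRows();
    }

    // streaming: catalog -> table -> stream -> scan
    static List<String> readMicroBatch(Catalog catalog, String name, long startOffset) {
        Stream stream = catalog.loadTable(name).newStream();
        return stream.newScan(startOffset, stream.latestOffset()).readRows();
    }

    public static void main(String[] args) {
        Map<String, Table> tables = new HashMap<>();
        tables.put("events", new InMemoryTable(Arrays.asList("a", "b", "c")));
        Catalog catalog = new InMemoryCatalog(tables);

        System.out.println(readBatch(catalog, "events"));         // prints [a, b, c]
        System.out.println(readMicroBatch(catalog, "events", 1)); // prints [b, c]
    }
}
```

In this sketch the only difference between batch and streaming is the extra `Stream` level between `Table` and `Scan`, while `Catalog`, `Table` and `Scan` are shared by both paths.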
Issue Links
- blocks: SPARK-25186 Stabilize Data Source V2 API (In Progress)