@@ -255,8 +255,41 @@ computational graph approach like Dask uses, etc.)._
## Possible direction for implementation
+ ### Rough prototypes
+
The `cuDFDataFrame`, `cuDFColumn` and `cuDFBuffer` sketched out by @kkraus14
[here](https://github.com/data-apis/dataframe-api/issues/29#issuecomment-685123386)
seem to be in the right direction.

+ [This prototype](https://github.com/wesm/dataframe-protocol/pull/1) by Wes
+ McKinney was the first attempt, and has some useful features.
+
TODO: work this out after making sure we're all on the same page regarding requirements.
+
+
+ ### Relevant existing protocols
+
+ Here are the four most relevant existing protocols, and what requirements they support:
+
+ | *supports*          | buffer protocol | `__array_interface__` | DLPack | Arrow C Data Interface |
+ |---------------------|:---------------:|:---------------------:|:------:|:----------------------:|
+ | Python API          |                 |           Y           |  (1)   |                        |
+ | C API               |        Y        |           Y           |   Y    |           Y            |
+ | arrays              |        Y        |           Y           |   Y    |           Y            |
+ | dataframes          |                 |                       |        |                        |
+ | chunking            |                 |                       |        |                        |
+ | devices             |                 |                       |   Y    |                        |
+ | bool/int/uint/float |        Y        |           Y           |   Y    |           Y            |
+ | missing data        |       (2)       |          (3)          |  (4)   |           Y            |
+ | string dtype        |       (4)       |          (4)          |        |           Y            |
+ | datetime dtypes     |                 |          (5)          |        |           Y            |
+ | categoricals        |       (6)       |          (6)          |  (7)   |          (6)           |
+
+ 1. The Python API is only an interface to call the C API under the hood; it
+    doesn't contain a description of how the data is laid out in memory.
+ 2. Can be done only via separate boolean mask arrays.
+ 3. `__array_interface__` has a `mask` attribute, which is a separate boolean array also implementing the `__array_interface__` protocol.
+ 4. Only fixed-length strings as a sequence of char or unicode.
+ 5. Only NumPy datetime and timedelta, which are limited compared to what the Arrow format offers.
+ 6. No explicit support; however, categoricals can be mapped to either integers or strings.
+ 7. No explicit support; categoricals can only be mapped to integers.
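
To make notes 2, 6 and 7 concrete, here is a minimal sketch that uses only the Python standard library. It is an illustration under stated assumptions, not part of any of the protocols above: the mask convention (1 means valid, 0 means missing) and the names `values`, `valid`, `codes`, `categories` and `to_optional_list` are all hypothetical. `array.array` is used only because it exposes the buffer protocol, so a consumer sees plain typed memory through `memoryview` and must learn out of band how the mask and the codes relate to the data.

```python
from array import array

# Data buffer of float64 values. The middle entry is a placeholder, because
# the buffer protocol itself has no way to mark an element as missing.
values = array("d", [1.5, 0.0, 3.0])

# Separate mask buffer (assumed convention: 1 = valid, 0 = missing).
valid = array("B", [1, 0, 1])


def to_optional_list(data, mask):
    """Combine a data buffer and a validity mask, using None for missing."""
    data_view = memoryview(data)
    mask_view = memoryview(mask)
    if len(data_view) != len(mask_view):
        raise ValueError("data and mask must have the same length")
    return [x if ok else None for x, ok in zip(data_view, mask_view)]


# A categorical column mapped to integer codes plus a lookup table.
# Only the codes travel through the buffer; the categories do not.
categories = ["low", "mid", "high"]
codes = array("i", [0, 2, 1, 2])
decoded = [categories[c] for c in memoryview(codes)]

print(memoryview(values).format, memoryview(valid).format)
print(to_optional_list(values, valid))
print(decoded)
```

In both cases the exchanged buffers carry only primitive values. The pairing of data with its mask, and of codes with their categories, is exactly the kind of metadata that a dataframe-level protocol would need to describe explicitly.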