Implement a new pilosa (server-less) index driver 

So far we have a pilosa index driver that uses the pilosa client library `github.com/pilosa/go-pilosa`. It is an HTTP client, so it requires pilosa to run as an external HTTP service.

It is now possible to use `github.com/pilosa/pilosa` as a library. A working (*trash-code*) prototype is available here: https://github.com/kuba--/go-mysql-server/tree/noserver-pilosa/sql/index/pilosa

Instead of refactoring the _current pilosa driver_, I propose to create a new one (_pilosalib_) that keeps the same functionality. Having a new driver lets us compare results and performance; when we're ready, we can just switch to the new driver and get rid of the old one.

_Pilosa as a library_ will create new files and directories inside the `root` folder (passed to the `NewDriver` function). Because of that, and to avoid overwriting and loading issues across drivers, we'll have to refactor the index folder structure.
I suggest adding an extra directory (`DriverID`) under the `root` folder, so each driver has its own space for mapping files, config files (if needed) and processing files, e.g.:

```sh
[root]
|- [pilosalib]
|    |- [db]
|        |- [table]
|            |- id.map # mapping file
|            |- id.cfg # config file (optional for pilosalib)
|            |- [idx-sha1(id, expressions)] # pilosa folder
|            |- [idx-sha1(id, expressions)] # pilosa folder
|
|- [pilosa]
     |- [db]
         |- [table]
             |- id.map # mapping file
             |- id.cfg # config file
             |- id.lock # optional lock/processing file
```

In other words, a driver will create only the following folders under the `root`: `driver_id/db/table`.
The mapping file will be renamed to `index_id.map`.
The config file will be renamed to `index_id.cfg`.
The processing/lock file will be renamed to `index_id.lock`.

All other subfolders may be created by third parties. For instance, _pilosalib_ creates the following substructure per index:
```sh
├── i-d9b85cddd6ac716f0326c32c7ba4bd9ae2aeb558
│   ├── f-daef88b79b2d1e52c779c70d4aa814546a1b10c2
│   │   └── views
│   │       └── standard
│   │           └── fragments
│   │               ├── 0
│   │               └── 0.cache
```
where `i-d9b85cddd6ac716f0326c32c7ba4bd9ae2aeb558` is an example index name and `f-daef88b79b2d1e52c779c70d4aa814546a1b10c2` is an example field name.
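
The example names above are a short prefix followed by 40 hex characters, which matches a SHA-1 digest. A minimal sketch of deriving such names follows; the helper `hashedName`, the zero-byte separator and the sample inputs are assumptions for illustration, and the prototype may combine the index ID and expressions differently.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// hashedName (hypothetical) returns prefix + hex(sha1(parts)).
// A zero byte is written after every part so that ("ab", "c") and
// ("a", "bc") hash differently; this separator is an assumption.
func hashedName(prefix string, parts ...string) string {
	h := sha1.New()
	for _, p := range parts {
		h.Write([]byte(p))
		h.Write([]byte{0})
	}
	return prefix + hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Sample inputs only: an index ID plus its expressions for the index
	// name, and a single expression for the field name.
	fmt.Println(hashedName("i-", "idx_hash", "commits.hash", "commits.author"))
	fmt.Println(hashedName("f-", "commits.hash"))
}
```

Hashing keeps the names fixed-length and filesystem-safe regardless of what the expressions contain.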

### Caveats 
- mapping: Pilosa's API allows setting column and row attributes as `map[uint64]interface{}`, but that requires either implementing our own storage to satisfy the `AttrStore` interface or trying to reuse pilosa's internal boltdb cache implementation. The second approach sounds promising in theory, but our current mapping also relies on boltdb, is more customised and optimised for our needs, and gives us more independence (instead of coupling us tightly to pilosa). Moreover, we don't have to change this component; we can just reuse it in the new driver.
So in the first step we don't need to change the mapping. In the next step we can consider using column attributes to store the `colID -> location` mapping, but for `value -> rowID` we may still use the external mapping.

- versioning: The latest pilosa release, v1.0.1, has some bugs that are already fixed on the master branch. We can start the implementation by vendoring master and switch to the v1.0.2 release later.
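
To make the `value -> rowID` mapping from the first caveat concrete, here is a minimal in-memory sketch. It is a stand-in only: the existing mapping is backed by boltdb, and the type `rowMapper`, its method names and the sequential ID assignment are assumptions for illustration.

```go
package main

import "fmt"

// rowMapper (hypothetical) keeps a value -> rowID table per field.
// The real mapping persists this in boltdb; a map is used here for brevity.
type rowMapper struct {
	rows map[string]map[string]uint64 // field -> value -> rowID
}

func newRowMapper() *rowMapper {
	return &rowMapper{rows: make(map[string]map[string]uint64)}
}

// rowID returns the row ID for value within field, assigning the next
// free ID the first time a value is seen.
func (m *rowMapper) rowID(field, value string) uint64 {
	values, ok := m.rows[field]
	if !ok {
		values = make(map[string]uint64)
		m.rows[field] = values
	}
	id, ok := values[value]
	if !ok {
		id = uint64(len(values))
		values[value] = id
	}
	return id
}

func main() {
	m := newRowMapper()
	fmt.Println(m.rowID("lang", "go"))
	fmt.Println(m.rowID("lang", "python"))
	fmt.Println(m.rowID("lang", "go")) // same value, same row ID as the first call
}
```

Keeping this component outside pilosa means the same mapping can serve both the old and the new driver.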
