Move ExpressionHash logic to the driver #305
Description
This logic is currently exposed at the sql/core.go level:
// ExpressionHash is a SHA-1 checksum
type ExpressionHash []byte
// NewExpressionHash returns a new SHA1 hash for given Expression instance.
// SHA1 checksum will be calculated based on ex.String().
func NewExpressionHash(ex Expression) ExpressionHash {
h := sha1.Sum([]byte(ex.String()))
return ExpressionHash(h[:])
}
// DecodeExpressionHash decodes a hexadecimal string to ExpressionHash
func DecodeExpressionHash(hexstr string) (ExpressionHash, error) {
h, err := hex.DecodeString(hexstr)
if err != nil {
return nil, err
}
return ExpressionHash(h), nil
}
// EncodeExpressionHash encodes an ExpressionHash to hexadecimal string
func EncodeExpressionHash(h ExpressionHash) string {
return hex.EncodeToString(h)
}
We also offer a way to deal with ExpressionHash values directly from the Index interface, through the ExpressionHashes method:
// Index is the basic representation of an index. It can be extended with
// more functionality by implementing more specific interfaces.
type Index interface {
// Get returns an IndexLookup for the given key in the index.
Get(key ...interface{}) (IndexLookup, error)
// Has checks if the given key is present in the index.
Has(key ...interface{}) (bool, error)
// ID returns the identifier of the index.
ID() string
// Database returns the database name this index belongs to.
Database() string
// Table returns the table name this index belongs to.
Table() string
// Expressions returns the indexed expressions. If the result is more than
// one expression, it means the index has multiple columns indexed. If it's
// just one, it means it may be an expression or a column.
ExpressionHashes() []ExpressionHash
// Driver ID of the index.
Driver() string
}
The thing is that all of this is a specific detail of the pilosa driver implementation, so we can move this logic there and not expose it as part of the API.
Instead, we can have an Expressions method which returns the expressions indexed by an index as the string representation of each Expression:
// Index is the basic representation of an index. It can be extended with
// more functionality by implementing more specific interfaces.
type Index interface {
// Get returns an IndexLookup for the given key in the index.
Get(key ...interface{}) (IndexLookup, error)
// Has checks if the given key is present in the index.
Has(key ...interface{}) (bool, error)
// ID returns the identifier of the index.
ID() string
// Database returns the database name this index belongs to.
Database() string
// Table returns the table name this index belongs to.
Table() string
// Expressions returns the indexed expressions as their string
// representation. If the result is more than one expression, it means the
// index has multiple columns indexed. If it's just one, it means it may
// be an expression or a column.
Expressions() []string
// Driver ID of the index.
Driver() string
}
The IndexDriver interface will need to change its API:
NOW
---
// IndexDriver manages the coordination between the indexes and their
// representation on disk.
type IndexDriver interface {
// ID returns the unique name of the driver.
ID() string
// Create a new index. If exprs is more than one expression, it means the
// index has multiple columns indexed. If it's just one, it means it may
// be an expression or a column.
Create(db, table, id string, expressionHashes []ExpressionHash, config map[string]string) (Index, error)
// LoadAll loads all indexes for given db and table
LoadAll(db, table string) ([]Index, error)
// Save the given index
Save(ctx *Context, index Index, iter IndexKeyValueIter) error
// Delete the given index.
Delete(index Index) error
}
AFTER (Create method signature changed)
-----
// IndexDriver manages the coordination between the indexes and their
// representation on disk.
type IndexDriver interface {
// ID returns the unique name of the driver.
ID() string
// Create a new index. If exprs is more than one expression, it means the
// index has multiple columns indexed. If it's just one, it means it may
// be an expression or a column.
Create(db, table, id string, expressions []string, config map[string]string) (Index, error)
// LoadAll loads all indexes for given db and table
LoadAll(db, table string) ([]Index, error)
// Save the given index
Save(ctx *Context, index Index, iter IndexKeyValueIter) error
// Delete the given index.
Delete(index Index) error
}
------
(MAYBE
Create(db, table, id string, expressions []sql.Expression, config map[string]string) (Index, error)
AND RELY ON THE DRIVER TO DO THE STRING CONVERSION?)
------
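If Create took expressions rather than strings, the conversion inside the driver would be small. A minimal sketch, assuming a stand-in Expression interface with only String() (the real sql.Expression has more methods) and a hypothetical column type and expressionStrings helper that are not part of the codebase:

```go
package main

import "fmt"

// Expression is a stand-in for sql.Expression; only String() matters here.
type Expression interface {
	String() string
}

// column is a hypothetical expression type used only for this illustration.
type column struct{ name string }

func (c column) String() string { return c.name }

// expressionStrings converts expressions to their string representation.
// A driver could do this internally if Create received expressions
// instead of strings.
func expressionStrings(exprs []Expression) []string {
	out := make([]string, len(exprs))
	for i, e := range exprs {
		out[i] = e.String()
	}
	return out
}

func main() {
	exprs := []Expression{column{"a"}, column{"b"}}
	fmt.Println(expressionStrings(exprs))
}
```

Either way the strings end up in the driver; the only question is which side of the interface performs the String() call.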
With this change we can get all the indexed expressions. One use case is the show indexes implementation, which currently does not show all the needed information because there is no way to get all the indexed expressions.
If we need to build an Expression instance, I guess we could parse the string. Anyway, parsing is not something we need for the moment. Outside the driver implementation we are using ExpressionHash to compare expressions.
ExpressionHash is the hash of the string representation of an Expression, so we can compare the strings directly.
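To illustrate why the hash is not needed for comparisons outside the driver, here is a small sketch. The helper names (hashOf, sameByHash, sameByString) are hypothetical and only mirror what NewExpressionHash does with ex.String():

```go
package main

import (
	"bytes"
	"crypto/sha1"
	"fmt"
)

// hashOf mirrors NewExpressionHash, taking the expression string directly.
func hashOf(expr string) []byte {
	h := sha1.Sum([]byte(expr))
	return h[:]
}

// sameByHash compares two expressions the way callers do today.
func sameByHash(a, b string) bool { return bytes.Equal(hashOf(a), hashOf(b)) }

// sameByString compares the string representations directly.
func sameByString(a, b string) bool { return a == b }

func main() {
	pairs := [][2]string{{"a + 1", "a + 1"}, {"a + 1", "b + 1"}}
	for _, p := range pairs {
		// Both comparisons agree for each pair.
		fmt.Println(sameByHash(p[0], p[1]) == sameByString(p[0], p[1]))
	}
}
```

Since the hash is a deterministic function of the string, equal strings always give equal hashes, and the string comparison also avoids any dependence on hash collisions.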
To be able to use Expressions() []string from the current pilosa index/driver implementation, we need to store these expressions as plain text in the index's config.yml file.
Right now we are storing the expressions' hashes in the config file, but if we store the expressions' strings too, it may not be necessary to store the hashes.
We can calculate the hashes from the strings on the fly when creating or loading an index. The pilosa index implementation would then keep the hashes internally for its own purposes and keep the expression strings to expose them through the Index interface methods.
WDYT @src-d/data-retrieval?