This repository was archived by the owner on Jan 28, 2021. It is now read-only.

Move ExpressionHash logic to the driver #305

Closed
@mcarmonaa

We currently expose this logic from sql/core.go:

// ExpressionHash is a SHA-1 checksum
type ExpressionHash []byte

// NewExpressionHash returns a new SHA1 hash for given Expression instance.
// SHA1 checksum will be calculated based on ex.String().
func NewExpressionHash(ex Expression) ExpressionHash {
	h := sha1.Sum([]byte(ex.String()))
	return ExpressionHash(h[:])
}

// DecodeExpressionHash decodes a hexadecimal string to ExpressionHash
func DecodeExpressionHash(hexstr string) (ExpressionHash, error) {
	h, err := hex.DecodeString(hexstr)
	if err != nil {
		return nil, err
	}
	return ExpressionHash(h), nil
}

// EncodeExpressionHash encodes an ExpressionHash to hexadecimal string
func EncodeExpressionHash(h ExpressionHash) string {
	return hex.EncodeToString(h)
}

We also expose ExpressionHashes directly through the Index interface, via the ExpressionHashes method:

// Index is the basic representation of an index. It can be extended with
// more functionality by implementing more specific interfaces.
type Index interface {
	// Get returns an IndexLookup for the given key in the index.
	Get(key ...interface{}) (IndexLookup, error)
	// Has checks if the given key is present in the index.
	Has(key ...interface{}) (bool, error)
	// ID returns the identifier of the index.
	ID() string
	// Database returns the database name this index belongs to.
	Database() string
	// Table returns the table name this index belongs to.
	Table() string
	// ExpressionHashes returns the hashes of the indexed expressions. If the
	// result has more than one hash, the index has multiple columns indexed.
	// If it's just one, it may be an expression or a column.
	ExpressionHashes() []ExpressionHash
	// Driver ID of the index.
	Driver() string
}

All of this is a detail specific to the pilosa driver implementation, so we can move the logic there and stop exposing it as part of the API.

Instead, we can have a method Expressions which returns the expressions indexed by an index as the string representation of an Expression:

// Index is the basic representation of an index. It can be extended with
// more functionality by implementing more specific interfaces.
type Index interface {
	// Get returns an IndexLookup for the given key in the index.
	Get(key ...interface{}) (IndexLookup, error)
	// Has checks if the given key is present in the index.
	Has(key ...interface{}) (bool, error)
	// ID returns the identifier of the index.
	ID() string
	// Database returns the database name this index belongs to.
	Database() string
	// Table returns the table name this index belongs to.
	Table() string
	// Expressions returns the indexed expressions as the string
	// representation of an expression. If the result is more than one
	// expression, it means the index has multiple columns indexed. If it's
	// just one, it means it may be an expression or a column.
	Expressions() []string
	// Driver ID of the index.
	Driver() string
}

The IndexDriver interface will need to change its API:

NOW
---
// IndexDriver manages the coordination between the indexes and their
// representation on disk.
type IndexDriver interface {
	// ID returns the unique name of the driver.
	ID() string
	// Create a new index. If exprs is more than one expression, it means the
	// index has multiple columns indexed. If it's just one, it means it may
	// be an expression or a column.
	Create(db, table, id string, expressionHashes []ExpressionHash, config map[string]string) (Index, error)
	// LoadAll loads all indexes for given db and table
	LoadAll(db, table string) ([]Index, error)
	// Save the given index
	Save(ctx *Context, index Index, iter IndexKeyValueIter) error
	// Delete the given index.
	Delete(index Index) error
}

AFTER (Create method signature changed)
-----
// IndexDriver manages the coordination between the indexes and their
// representation on disk.
type IndexDriver interface {
	// ID returns the unique name of the driver.
	ID() string
	// Create a new index. If exprs is more than one expression, it means the
	// index has multiple columns indexed. If it's just one, it means it may
	// be an expression or a column.
	Create(db, table, id string, expressions []string, config map[string]string) (Index, error)
	// LoadAll loads all indexes for given db and table
	LoadAll(db, table string) ([]Index, error)
	// Save the given index
	Save(ctx *Context, index Index, iter IndexKeyValueIter) error
	// Delete the given index.
	Delete(index Index) error
}

------
(MAYBE
    Create(db, table, id string, expressions []sql.Expression, config map[string]string) (Index, error)

AND RELY ON THE DRIVER TO DO THE STRING CONVERSION?)
------

With this change we can get all the indexed expressions. One use case is the show indexes implementation: for the moment it does not show all the needed information, because there is no way to get all the indexed expressions.
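As a rough illustration of the show indexes use case, the sketch below renders one output row from the proposed Expressions() method. The Index interface here is trimmed to the methods the sketch needs, and showIndexRow, fakeIndex, and the row layout are hypothetical names invented for this example, not part of the codebase.

```go
package main

import "strings"

// Index is a trimmed-down stand-in for the proposed interface; only the
// methods this sketch needs are included.
type Index interface {
	ID() string
	Database() string
	Table() string
	Expressions() []string
	Driver() string
}

// showIndexRow is a hypothetical helper that renders one row of show
// indexes output. The column order and separator are assumptions.
func showIndexRow(idx Index) string {
	return strings.Join([]string{
		idx.Database(),
		idx.Table(),
		idx.ID(),
		idx.Driver(),
		strings.Join(idx.Expressions(), ", "),
	}, " | ")
}

// fakeIndex is an in-memory Index used only for illustration.
type fakeIndex struct {
	db, table, id, driver string
	exprs                 []string
}

func (i fakeIndex) ID() string            { return i.id }
func (i fakeIndex) Database() string      { return i.db }
func (i fakeIndex) Table() string         { return i.table }
func (i fakeIndex) Expressions() []string { return i.exprs }
func (i fakeIndex) Driver() string        { return i.driver }
```

Calling showIndexRow on a fakeIndex with two expressions yields a single line that lists both of them, which is the information the current hash-based API cannot provide.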

If we ever need to build an Expression instance, I guess we could parse the string. Anyway, parsing is not something we need for the moment. Outside the driver implementation we only use ExpressionHash to compare expressions.

ExpressionHash is the hash of the string representation of an Expression, so we can compare the strings directly instead.
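A minimal sketch of why the two comparisons agree: equal strings always produce equal SHA-1 hashes, and different strings produce different hashes barring a collision. The helper names hashOf, sameByHash, and sameByString are made up for this example.

```go
package main

import (
	"bytes"
	"crypto/sha1"
)

// hashOf mirrors what NewExpressionHash does with ex.String(): it returns
// the SHA-1 checksum of the expression's string representation.
func hashOf(expr string) []byte {
	h := sha1.Sum([]byte(expr))
	return h[:]
}

// sameByHash compares two expression strings through their hashes, as the
// code outside the driver does today.
func sameByHash(a, b string) bool {
	return bytes.Equal(hashOf(a), hashOf(b))
}

// sameByString compares the expression strings directly, as proposed.
func sameByString(a, b string) bool {
	return a == b
}
```

For any pair of inputs where no SHA-1 collision is involved, sameByHash and sameByString return the same answer, so callers outside the driver lose nothing by switching to plain string comparison.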

To be able to use Expressions() []string from the current pilosa index/driver implementation, we need to store these expressions as plain text in the index's config.yml file.

Right now we store the expressions' hashes in the config file, but if we store the expressions' strings too, it may no longer be necessary to store the hashes.

We can calculate the hashes from the string on the fly when creating or loading an index. So the specific pilosa index implementation will keep the hashes internally for its own purposes and will keep the string expressions to expose them through the Index interface methods.
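The idea above could look roughly like the following sketch. It is not the actual pilosa driver code: indexConfig, pilosaIndex, newPilosaIndex, and hexHashes are hypothetical names, and the real config.yml layout is not shown in this issue.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
)

// indexConfig is an assumed shape for the part of config.yml that would
// hold the expressions as plain text.
type indexConfig struct {
	Expressions []string
}

// pilosaIndex keeps the expression strings for the public API and the
// hashes for the driver's internal use.
type pilosaIndex struct {
	expressions []string
	hashes      [][]byte
}

// newPilosaIndex computes the hashes on the fly from the stored strings,
// so the hashes themselves never need to be persisted.
func newPilosaIndex(cfg indexConfig) *pilosaIndex {
	idx := &pilosaIndex{
		expressions: append([]string(nil), cfg.Expressions...),
	}
	for _, e := range idx.expressions {
		h := sha1.Sum([]byte(e))
		idx.hashes = append(idx.hashes, h[:])
	}
	return idx
}

// Expressions exposes the string expressions through the Index interface.
func (i *pilosaIndex) Expressions() []string {
	return i.expressions
}

// hexHashes returns the internal hashes hex-encoded, for example to name
// on-disk resources. How the driver actually uses them is an assumption.
func (i *pilosaIndex) hexHashes() []string {
	out := make([]string, 0, len(i.hashes))
	for _, h := range i.hashes {
		out = append(out, hex.EncodeToString(h))
	}
	return out
}
```

The same constructor can serve both Create and LoadAll, since in both cases the driver starts from the expression strings and derives everything else.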

WDYT @src-d/data-retrieval ?

Labels: proposal (proposal for new additions or changes)
