Unable to make predictions using catboost model with either classification or regression #1681

Open
@tbergot

Description

Expected

We can make predictions using the catboost model with PGML

Problem

Predictions fail with the error below whenever the model is trained with the catboost algorithm, for both the 'classification' and 'regression' tasks:

SQL Error [XX000]: ERROR: ValueError: cannot reshape array of size 4 into shape (0)

Additional notes

  • The same steps work with xgboost (classification and regression); they only break with catboost
  • We also tried passing a different number of arguments (features) to the predict method, but it still fails

Reproduction steps

  1. Run the "quick start with docker" command (without the GPUs instruction as we don't need them):
docker run \
    -it \
    -v postgresml_data:/var/lib/postgresql \
    -p 5433:5432 \
    -p 8000:8000 \
    ghcr.io/postgresml/postgresml:2.10.0 \
    sudo -u postgresml psql -d postgresml
  2. In another terminal, run the following SQL script using psql:
CREATE EXTENSION pgml cascade;
-- Simple test data
CREATE TABLE pgml.training_data (
    id SERIAL PRIMARY KEY,
    feature1 REAL,
    feature2 REAL,
    feature3 REAL,
    target INTEGER
);
INSERT INTO pgml.training_data (feature1, feature2, feature3, target)
VALUES
    (1.2, 3.4, 5.6, 0),
    (2.3, 4.5, 6.7, 1),
    (3.4, 5.6, 7.8, 0),
    (4.5, 6.7, 8.9, 1),
    (5.6, 7.8, 9.0, 0);
-- Train and use an XGBoost model
SELECT * FROM pgml.train(
    project_name => 'classification_xgboost',
    task => 'classification',
    relation_name => 'pgml.training_data',
    y_column_name => 'target',
    algorithm => 'xgboost'
);
SELECT pgml.predict(
    project_name => 'classification_xgboost',
    features => ARRAY[10::integer, 0.1::real, 0.3::real, 0.1::real]
) AS prediction;
-- Works well
-- Now with CatBoost
SELECT * FROM pgml.train(
    project_name => 'classification_catboost',
    task => 'classification',
    relation_name => 'pgml.training_data',
    y_column_name => 'target',
    algorithm => 'catboost'
);
SELECT pgml.predict(
    project_name => 'classification_catboost',
    features => ARRAY[10::integer, 0.1::real, 0.3::real, 0.1::real]
) AS prediction;
--Throws error:
--SQL Error [XX000]: ERROR: ValueError: cannot reshape array of size 4 into shape (0)
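
One possible reading of the message, offered as an unverified guess rather than a diagnosis: the "size 4" part matches the four values passed to pgml.predict (presumably id plus the three feature columns), while "(0)" suggests that the CatBoost path ended up with a row width of zero. The sketch below is plain Python with a hypothetical helper, rows_for, that mimics the size check a NumPy-style reshape into rows of a fixed width performs. It does not call PGML or CatBoost, and whether PGML actually derives a feature count of 0 for CatBoost models is an assumption.

```python
def rows_for(size: int, width: int) -> int:
    """Return how many rows a flat array of `size` values forms when
    split into rows of `width` values (hypothetical helper)."""
    # A zero width, or a size that is not a multiple of the width,
    # cannot be satisfied; the message wording imitates the reported error.
    if width <= 0 or size % width != 0:
        raise ValueError(
            f"cannot reshape array of size {size} into shape ({width})"
        )
    return size // width


if __name__ == "__main__":
    # Four values with a width of four form a single row of features.
    print(rows_for(4, 4))
    try:
        # Assumed failing case: the feature count is seen as 0.
        rows_for(4, 0)
    except ValueError as err:
        print(err)
```

If this reading is right, changing the number of values passed to pgml.predict would not help, which is consistent with the second note above: no input length can be split into rows of width zero.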

Any help regarding this issue would be appreciated. Thank you very much.
