Skip to content

[client-v2] Move SerDe related logic to a separate package #2378

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 28, 2025

Conversation

chernser
Copy link
Contributor

Summary

Closes

Checklist

Delete items not relevant to your PR:

  • Closes #
  • Unit and integration tests covering the common scenarios were added
  • A human-readable description of the changes was provided to include in CHANGELOG
  • For significant changes, documentation in https://github.com/ClickHouse/clickhouse-docs was updated with further explanations or tutorials

@mzitnik
Copy link
Contributor

mzitnik commented May 21, 2025

Maybe we also need to to provide a getter method to retrieve the POJO SerDe?

@chernser chernser changed the title [client-v2] moved serde stadff to separate package [client-v2] Move SerDe related logic to a separate package May 22, 2025
@chernser chernser marked this pull request as ready for review May 27, 2025 19:59
@chernser chernser requested a review from mzitnik May 28, 2025 03:44
Copy link

@chernser chernser merged commit cec335e into main May 28, 2025
22 of 25 checks passed
@chernser chernser deleted the draft_standalone_serializer branch May 28, 2025 15:46
@chernser
Copy link
Contributor Author

/windsurf-review

@chernser chernser requested a review from Copilot May 30, 2025 14:52
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR moves all serialization/deserialization (SerDe) logic from the insert and query packages into a dedicated "serde" package. Key changes include updating package declarations and renaming interfaces and methods to better reflect their SerDe role.

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
SerializerNotFoundException.java Updated package declaration from "insert" to "serde".
POJOSerDe.java New file implementing class-based registration for POJO serialization/deserialization.
POJOFieldSerializer.java and POJOFieldDeserializer.java Renamed interfaces to align with new SerDe terminology and moved their packages.
DataSerializationException.java Updated to use the new POJOFieldSerializer type.
SerializerUtils.java Updated to return POJOFieldDeserializer in compile method and update corresponding reference.
AbstractBinaryFormatReader.java Replaced deprecated POJOSetter with POJOFieldDeserializer in object deserialization.
Client.java Refactored code to leverage POJOSerDe for POJO registration and serializer/deserializer lookup.
Comments suppressed due to low confidence (1)

client-v2/src/main/java/com/clickhouse/client/api/data_formats/internal/SerializerUtils.java:798

  • The method name 'compilePOJOSetter' is inconsistent with its return type POJOFieldDeserializer; consider renaming it to 'compilePOJOFieldDeserializer' for clarity.
public static POJOFieldDeserializer compilePOJOSetter(Method setterMethod, ClickHouseColumn column) {

import java.util.concurrent.ConcurrentHashMap;

/**
* This class glues serialization staff together
Copy link
Preview

Copilot AI May 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] There appears to be a typo in the javadoc ('staff' should likely be 'stuff' or rephrased appropriately); consider revising for clarity.

Suggested change
* This class glues serialization staff together
* This class glues serialization components together

Copilot uses AI. Check for mistakes.

Copy link

@windsurf-bot windsurf-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other comments (6)
  • client-v2/src/main/java/com/clickhouse/client/api/serde/POJOSerDe.java (62-63) The lambda in `registerClass` calls `getterMethod.invoke(obj)` without handling potential exceptions like `IllegalAccessException` or `InvocationTargetException`. This could lead to unexpected behavior or difficult-to-debug issues. Consider adding proper exception handling.
  • client-v2/src/main/java/com/clickhouse/client/api/Client.java (126-126) This PR moves serialization logic to a separate package, which is a good refactoring. However, I don't see any tests added for the new `POJOSerDe` class. According to the project guidelines, tests are required for newly added functionality. Please add unit tests that verify the functionality of the new `POJOSerDe` class and its integration with the Client class.
  • client-v2/src/main/java/com/clickhouse/client/api/data_formats/internal/AbstractBinaryFormatReader.java (11-11) This PR is moving serialization/deserialization logic to a separate package, but I don't see any tests included for this change. According to our guidelines, we should require tests for changes and newly added functionality to ensure the refactoring doesn't introduce regressions.
  • client-v2/src/main/java/com/clickhouse/client/api/data_formats/internal/SerializerUtils.java (798-798) This PR renames `POJOSetter` to `POJOFieldDeserializer` and moves it to a new package, but there are no tests included to verify that the functionality still works correctly after this change. Please add tests to ensure the renamed class functions as expected.
  • client-v2/src/main/java/com/clickhouse/client/api/serde/POJOFieldDeserializer.java (13-13) This PR renames and moves the interface to a more appropriate package, which is good. However, according to our guidelines, we require tests for changes. Please add tests that verify the functionality of the `POJOFieldDeserializer` interface in its new location.
  • client-v2/src/main/java/com/clickhouse/client/api/serde/POJOSerDe.java (97-107) The logic for determining the schema key from `TableSchema` is duplicated in `getFieldSerializers` and `getFieldDeserializers` methods. Consider extracting this to a private helper method to avoid duplication and ensure consistent behavior.
        private String getSchemaKey(TableSchema schema) {
            return schema.getTableName() == null ? schema.getQuery() : schema.getTableName();
        }
    
        public Map<String, POJOFieldSerializer> getFieldSerializers(Class<?> clazz, TableSchema schema) {
            // TODO: instead of storing a query list of columns should be stored and it will work both for read and write
            return serializers.getOrDefault(clazz, Collections.emptyMap())
                    .getOrDefault(getSchemaKey(schema), Collections.emptyMap());
        }
    
        public Map<String, POJOFieldDeserializer> getFieldDeserializers(Class<?> clazz, TableSchema schema) {
            return deserializers.getOrDefault(clazz,
                    Collections.emptyMap()).getOrDefault(getSchemaKey(schema), Collections.emptyMap());
        }
    


import java.io.IOException;
import java.io.OutputStream;
import java.lang.reflect.InvocationTargetException;

public interface POJOSerializer {
public interface POJOFieldSerializer {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR moves and renames the serialization interface, but I don't see any tests included to verify that existing code using this interface continues to work correctly after these changes. According to our guidelines, we should include tests for all changes. Please add tests that verify the functionality with the renamed interface.

/**
* This class glues serialization staff together
*/
public class POJOSerDe {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a new class with complex serialization/deserialization logic, but I don't see any corresponding unit tests. Please add tests to verify the functionality of this class, especially around the method matching, serialization/deserialization, and schema key handling.

Comment on lines +54 to +55
Map<String, POJOFieldSerializer> schemaSerializers = new HashMap<>();
Map<String, POJOFieldDeserializer> schemaDeserializers = new ConcurrentHashMap<>();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's inconsistent use of collection types - schemaSerializers uses HashMap while schemaDeserializers uses ConcurrentHashMap. Since both are used in similar contexts, they should use the same collection type for consistency and to avoid potential thread safety issues.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistency is not a problem. Here is more question a need of concurrent collections. What should I used?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants