Add builders for percentile and median accumulators/window functions #1139

Merged
merged 16 commits into from
Jun 21, 2023
56 changes: 56 additions & 0 deletions driver-core/src/main/com/mongodb/client/model/Accumulators.java
@@ -67,6 +67,51 @@ public static <TExpression> BsonField avg(final String fieldName, final TExpress
return accumulatorOperator("$avg", fieldName, expression);
}

/**
* Returns a combination of a computed field and an accumulator that generates a BSON {@link org.bson.BsonType#ARRAY Array}
* containing computed values from the given {@code inExpression} based on the provided {@code pExpression}, which represents an array
* of percentiles of interest within a group, where each element is a numeric value between 0.0 and 1.0 (inclusive).
*
* @param fieldName The field computed by the accumulator.
* @param inExpression The input expression.
* @param pExpression The expression representing the percentiles of interest.
* @param method The method to be used for computing the percentiles.
* @param <InExpression> The type of the input expression.
* @param <PExpression> The type of the percentile expression.
* @return The requested {@link BsonField}.
* @mongodb.driver.manual reference/operator/aggregation/percentile/ $percentile
* @since 4.10
* @mongodb.server.release 7.0
*/
public static <InExpression, PExpression> BsonField percentile(final String fieldName, final InExpression inExpression,
final PExpression pExpression, final QuantileMethod method) {
notNull("fieldName", fieldName);
notNull("inExpression", inExpression);
notNull("pExpression", pExpression);
notNull("method", method);
return quantileAccumulator("$percentile", fieldName, inExpression, pExpression, method);
}

/**
* Returns a combination of a computed field and an accumulator that generates a BSON {@link org.bson.BsonType#DOUBLE Double}
* representing the median value computed from the given {@code inExpression} within a group.
Comment on lines +96 to +97
Collaborator

I am finding these docs surprisingly hard to follow. There appear to be 3 levels of abstraction, and at least 6 complex relations in the same sentence:

  • this combines and returns: computed field + accumulator
  • which generates: BSON
  • which represents: median (which, in turn: computed, from-given, within-group)

I think we should focus on what this method is used for, and break up the relevant pieces of information, starting with the most relevant:

Used to determine the median value for the group. The result is emitted under the specified field name. Each item in a group is a document, so the "input expression" is used to specify the numeric value for each document. This is typically a numeric field on each document, but any expression may be specified.

I think we should also specify what happens to non-numeric values, what the result is when there are no documents, and what MQL value represents the document in cases where a plain field is not used (usually you can just use fields, but it should be possible to get the "median" number of fields per document, and I don't see how to do this ... would we use $$CURRENT?). And, we should have tests that cover these cases.

If we do want to still focus on (or include) certain technical details rather than usage, I think we should be more clear, and still, for example, break up the information so it is easier to understand.

(The same I think applies to the other docs.)

Collaborator

@katcharov katcharov Jun 21, 2023

Discussed: we can address this later, in a future PR (or, decide not to).
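
The usage-focused wording proposed in this thread (the input expression yields one value per document, and the median of those values is emitted for the group) can be illustrated with a small stand-alone sketch. This is not the driver or the server: `GroupMedianSketch` and its `median` helper are hypothetical names, the computation is an exact nearest-rank median rather than the server's `approximate` method, and skipping non-numeric values and returning `null` for an empty group are assumptions made for the sketch, not behavior confirmed by this PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class GroupMedianSketch {
    // Exact nearest-rank median of the values that the input expression yields
    // for each document in the group.
    // Assumption (sketch only): non-numeric results are skipped, and a group
    // with no numeric values yields null.
    static <D> Double median(List<D> group, Function<D, Object> inExpression) {
        List<Double> values = new ArrayList<>();
        for (D doc : group) {
            Object v = inExpression.apply(doc);
            if (v instanceof Number) {
                values.add(((Number) v).doubleValue());
            }
        }
        if (values.isEmpty()) {
            return null;
        }
        Collections.sort(values);
        // Nearest rank for p = 0.5 is ceil(0.5 * n); subtract 1 for a zero-based index.
        int rank = (int) Math.ceil(0.5 * values.size());
        return values.get(rank - 1);
    }

    public static void main(String[] args) {
        // Four "documents", one of which has a non-numeric price.
        List<Map<String, Object>> group = new ArrayList<>();
        for (Object price : Arrays.asList(10, 30, "n/a", 20)) {
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("price", price);
            group.add(doc);
        }
        // The lambda plays the role of the input expression "$price".
        System.out.println(median(group, doc -> doc.get("price")));
    }
}
```

The lambda stands in for `inExpression`; with the builder from this PR the same role is played by an expression such as a field path passed to `Accumulators.median`.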

*
* @param fieldName The field computed by the accumulator.
* @param inExpression The input expression.
* @param method The method to be used for computing the median.
* @param <InExpression> The type of the input expression.
Collaborator

I wonder if we should link the "type" (or the word "expression" on the param itself) of any new docs to MqlValue (which is in Beta).

Member Author

@vbabanin vbabanin Jun 21, 2023

I think it is a good idea. However, I wonder whether we should instead state in the class-level documentation that the Mql API can be used wherever an expression is required. MqlValue can be used in almost all methods of the Accumulators class.

Collaborator

This is fine, this would be inconsistent with the rest of the class.

Discussed: we can address this later, in a future PR (or, decide not to).

* @return The requested {@link BsonField}.
* @mongodb.driver.manual reference/operator/aggregation/median/ $median
* @since 4.10
* @mongodb.server.release 7.0
*/
public static <InExpression> BsonField median(final String fieldName, final InExpression inExpression, final QuantileMethod method) {
notNull("fieldName", fieldName);
notNull("inExpression", inExpression);
notNull("method", method);
return quantileAccumulator("$median", fieldName, inExpression, null, method);
}

/**
* Gets a field name for a $group operation representing the value of the given expression when applied to the first member of
* the group.
@@ -510,6 +555,17 @@ private static <OutExpression, NExpression> BsonField sortingPickNAccumulator(
.append("n", nExpression)));
}

private static <InExpression, PExpression> BsonField quantileAccumulator(final String quantileAccumulatorName,
final String fieldName, final InExpression inExpression,
@Nullable final PExpression pExpression, final QuantileMethod method) {
Document document = new Document("input", inExpression)
.append("method", method.toBsonValue());
if (pExpression != null) {
document.append("p", pExpression);
}
return accumulatorOperator(quantileAccumulatorName, fieldName, document);
}

private Accumulators() {
}
}
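
The shape of the operator document that `quantileAccumulator` assembles can be sketched without the driver on the classpath. The sketch below mirrors the insertion order used above (`input`, then `method`, then `p` only when it is non-null) with plain `LinkedHashMap`s. `QuantileSpecSketch` and `quantileSpec` are hypothetical names, the method is passed as a plain string instead of a `BsonValue`, and wrapping the arguments under the operator name is an assumption about what `accumulatorOperator` does.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class QuantileSpecSketch {
    // Hypothetical stand-in for quantileAccumulator: builds
    // { <operator>: { input: ..., method: ..., p: ... } }, omitting "p" when it is null.
    static Map<String, Object> quantileSpec(String operator, Object input, Object p, String method) {
        Map<String, Object> args = new LinkedHashMap<>();
        args.put("input", input);
        args.put("method", method);
        if (p != null) {
            // Only $percentile supplies "p"; $median passes null.
            args.put("p", p);
        }
        Map<String, Object> spec = new LinkedHashMap<>();
        spec.put(operator, args);
        return spec;
    }

    public static void main(String[] args) {
        List<Double> percentiles = Arrays.asList(0.5, 0.9);
        System.out.println(quantileSpec("$percentile", "$price", percentiles, "approximate"));
        System.out.println(quantileSpec("$median", "$price", null, "approximate"));
    }
}
```

Because `$median` takes no `p` argument, sharing one helper with a nullable `p` keeps both builders on the same code path.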
@@ -0,0 +1,29 @@
/*
* Copyright 2008-present MongoDB, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.mongodb.client.model;

import com.mongodb.annotations.Sealed;


/**
* @see QuantileMethod#approximate()
* @since 4.10
* @mongodb.server.release 7.0
*/
@Sealed
public interface ApproximateQuantileMethod extends QuantileMethod {
}

76 changes: 76 additions & 0 deletions driver-core/src/main/com/mongodb/client/model/QuantileMethod.java
@@ -0,0 +1,76 @@
/*
* Copyright 2008-present MongoDB, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.mongodb.client.model;

import com.mongodb.annotations.Sealed;
import org.bson.BsonString;
import org.bson.BsonValue;

import static com.mongodb.assertions.Assertions.notNull;

/**
* This interface represents a quantile method used in quantile accumulators of the {@code $group} and
* {@code $setWindowFields} stages.
* <p>
* It provides methods for creating and converting quantile methods to {@link BsonValue}.
* </p>
*
* @see Accumulators#percentile(String, Object, Object, QuantileMethod)
* @see Accumulators#median(String, Object, QuantileMethod)
* @see WindowOutputFields#percentile(String, Object, Object, QuantileMethod, Window)
* @see WindowOutputFields#median(String, Object, QuantileMethod, Window)
* @since 4.10
* @mongodb.server.release 7.0
*/
@Sealed
public interface QuantileMethod {
/**
* Returns a {@link QuantileMethod} instance representing the "approximate" quantile method.
*
* @return The requested {@link QuantileMethod}.
*/
static ApproximateQuantileMethod approximate() {
return new QuantileMethodBson(new BsonString("approximate"));
}

/**
* Creates a {@link QuantileMethod} from a {@link BsonValue} in situations when there is no builder method
* that better satisfies your needs.
* This method cannot be used to validate the syntax.
* <p>
* <i>Example</i><br>
* The following code creates two functionally equivalent {@link QuantileMethod}s,
* though they may not be {@linkplain Object#equals(Object) equal}.
* <pre>{@code
* QuantileMethod method1 = QuantileMethod.approximate();
* QuantileMethod method2 = QuantileMethod.of(new BsonString("approximate"));
* }</pre>
*
* @param method A {@link BsonValue} representing the required {@link QuantileMethod}.
* @return The requested {@link QuantileMethod}.
*/
static QuantileMethod of(final BsonValue method) {
notNull("method", method);
return new QuantileMethodBson(method);
}

/**
* Converts this object to {@link BsonValue}.
*
* @return A {@link BsonValue} representing this {@link QuantileMethod}.
*/
BsonValue toBsonValue();
}
@@ -0,0 +1,50 @@
/*
* Copyright 2008-present MongoDB, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.mongodb.client.model;

import org.bson.BsonValue;

import java.util.Objects;

final class QuantileMethodBson implements ApproximateQuantileMethod {
private final BsonValue bsonValue;

QuantileMethodBson(final BsonValue bsonValue) {
this.bsonValue = bsonValue;
}

@Override
public BsonValue toBsonValue() {
return bsonValue;
}

@Override
public boolean equals(final Object o) {
if (this == o) {
return true;
}
if (o == null || getClass() != o.getClass()) {
return false;
}
QuantileMethodBson that = (QuantileMethodBson) o;
return Objects.equals(bsonValue, that.bsonValue);
}

@Override
public int hashCode() {
return Objects.hash(bsonValue);
}
}
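
The `QuantileMethod.of` Javadoc says the two instances in its example "may not be equal". With the implementation shown here, both factories appear to return the same wrapper class, and `equals` compares only the wrapped value. The following stand-alone sketch reproduces that pattern with a `String` in place of `BsonValue`; `MethodSketch`, `MethodValue` and `QuantileMethodEqualitySketch` are hypothetical names used only for illustration.

```java
import java.util.Objects;

// Hypothetical stand-in for QuantileMethod, wrapping a String instead of a BsonValue.
interface MethodSketch {
    String toValue();

    static MethodSketch approximate() {
        return new MethodValue("approximate");
    }

    static MethodSketch of(String value) {
        return new MethodValue(Objects.requireNonNull(value, "value"));
    }
}

// Hypothetical stand-in for QuantileMethodBson: equality is based on the wrapped value only.
final class MethodValue implements MethodSketch {
    private final String value;

    MethodValue(String value) {
        this.value = value;
    }

    @Override
    public String toValue() {
        return value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        return Objects.equals(value, ((MethodValue) o).value);
    }

    @Override
    public int hashCode() {
        return Objects.hash(value);
    }
}

final class QuantileMethodEqualitySketch {
    public static void main(String[] args) {
        MethodSketch viaBuilder = MethodSketch.approximate();
        MethodSketch viaOf = MethodSketch.of("approximate");
        // Same wrapper class and equal wrapped values, so this prints true.
        System.out.println(viaBuilder.equals(viaOf));
    }
}
```

The Javadoc wording still leaves room for a later implementation in which the builder and `of` return different classes, in which case the two instances would no longer compare equal.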
@@ -111,6 +111,62 @@ public static <TExpression> WindowOutputField avg(final String path, final TExpr
return simpleParameterWindowFunction(path, "$avg", expression, window);
}

/**
* Builds a window output field of percentiles of the evaluation results of the {@code inExpression}
* over documents in the specified {@code window}. The {@code pExpression} parameter represents an array of
* percentiles of interest, with each element being a numeric value between 0.0 and 1.0 (inclusive).
*
* @param path The output field path.
* @param inExpression The input expression.
* @param pExpression The expression representing the percentiles of interest.
* @param method The method to be used for computing the percentiles.
* @param window The window.
* @param <InExpression> The type of the input expression.
* @param <PExpression> The type of the percentile expression.
* @return The constructed windowed output field.
* @mongodb.driver.manual reference/operator/aggregation/percentile/ $percentile
* @since 4.10
* @mongodb.server.release 7.0
*/
public static <InExpression, PExpression> WindowOutputField percentile(final String path, final InExpression inExpression,
final PExpression pExpression, final QuantileMethod method,
@Nullable final Window window) {
notNull("path", path);
notNull("inExpression", inExpression);
notNull("pExpression", pExpression);
notNull("method", method);
Map<ParamName, Object> args = new LinkedHashMap<>(3);
args.put(ParamName.INPUT, inExpression);
args.put(ParamName.P_LOWERCASE, pExpression);
args.put(ParamName.METHOD, method.toBsonValue());
return compoundParameterWindowFunction(path, "$percentile", args, window);
}

/**
* Builds a window output field representing the median value of the evaluation results of the {@code inExpression}
* over documents in the specified {@code window}.
*
* @param path The output field path.
* @param inExpression The input expression.
* @param method The method to be used for computing the median.
* @param window The window.
* @param <InExpression> The type of the input expression.
* @return The constructed windowed output field.
* @mongodb.driver.manual reference/operator/aggregation/median/ $median
* @since 4.10
* @mongodb.server.release 7.0
*/
public static <InExpression> WindowOutputField median(final String path, final InExpression inExpression,
final QuantileMethod method,
@Nullable final Window window) {
notNull("path", path);
notNull("inExpression", inExpression);
notNull("method", method);
Map<ParamName, Object> args = new LinkedHashMap<>(2);
args.put(ParamName.INPUT, inExpression);
args.put(ParamName.METHOD, method.toBsonValue());
return compoundParameterWindowFunction(path, "$median", args, window);
}

/**
* Builds a window output field of the sample standard deviation of the evaluation results of the {@code expression} over the
* {@code window}.
@@ -1013,11 +1069,13 @@ private enum ParamName {
UNIT("unit"),
N_UPPERCASE("N"),
N_LOWERCASE("n"),
P_LOWERCASE("p"),
ALPHA("alpha"),
OUTPUT("output"),
BY("by"),
DEFAULT("default"),
SORT_BY("sortBy"),
METHOD("method");

private final String value;
