Skip to content

Do not call default factories taking the data argument if a validation error already occurred #1623

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions src/errors/types.rs
Original file line number Diff line number Diff line change
Expand Up @@ -196,6 +196,9 @@ error_types! {
class_name: {ctx_type: String, ctx_fn: field_from_context},
},
// ---------------------
// Default factory not called (happens when there's already an error and the factory takes data)
DefaultFactoryNotCalled {},
// ---------------------
// None errors
NoneRequired {},
// ---------------------
Expand Down Expand Up @@ -490,6 +493,7 @@ impl ErrorType {
Self::ModelAttributesType {..} => "Input should be a valid dictionary or object to extract fields from",
Self::DataclassType {..} => "Input should be a dictionary or an instance of {class_name}",
Self::DataclassExactType {..} => "Input should be an instance of {class_name}",
Self::DefaultFactoryNotCalled {..} => "The default factory uses validated data, but at least one validation error occurred",
Self::NoneRequired {..} => "Input should be None",
Self::GreaterThan {..} => "Input should be greater than {gt}",
Self::GreaterThanEqual {..} => "Input should be greater than or equal to {ge}",
Expand Down
2 changes: 1 addition & 1 deletion src/validators/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -618,7 +618,7 @@ pub fn build_validator(
pub struct Extra<'a, 'py> {
/// Validation mode
pub input_type: InputType,
/// This is used as the `data` kwargs to validator functions
/// This is used as the `data` kwargs to validator functions and default factories (if they accept the argument)
pub data: Option<Bound<'py, PyDict>>,
/// whether we're in strict or lax mode
pub strict: Option<bool>,
Expand Down
15 changes: 10 additions & 5 deletions src/validators/model_fields.rs
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same needs to be done for DCs/TDs

Original file line number Diff line number Diff line change
Expand Up @@ -191,13 +191,18 @@ impl Validator for ModelFieldsValidator {
fields_set_vec.push(field.name_py.clone_ref(py));
fields_set_count += 1;
}
Err(ValError::Omit) => continue,
Err(ValError::LineErrors(line_errors)) => {
for err in line_errors {
errors.push(lookup_path.apply_error_loc(err, self.loc_by_alias, &field.name));
Err(e) => {
state.has_field_error = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to track this on the state? Should we adjust the algorithm to just never call .default_value if there's any errors?

(We could go further and rearrange things a bit so that all default values only get filled in after the whole input is processed, there is some overlap with #1620 🤔)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to track this on the state? Should we adjust the algorithm to just never call .default_value if there's any errors?

It's only relevant for default factories taking the data argument, so if you just have a "static" default it shouldn't matter. And because the logic to call the default value is nested under the current field validator, I can't really "influence" the logic from the model field.

This state approach doesn't look right, but I couldn't find a better way to do so 🤔

(We could go further and rearrange things a bit so that all default values only get filled in after the whole input is processed, there is some overlap with #1620 🤔)

Would this allow using other field values independently from the field order?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. I missed the default factories guarantee field order; this makes a lot of sense.

Because of the existence of PydanticUndefined as a way to ask for defaults, I think my original proposal actually doesn't make sense. 🤔

... but I do worry about when this flag is reset. Currently once you set it it's true for the rest of the validation. This seems problematic in unions, for example.

Copy link
Member Author

@Viicos Viicos Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed this is one of the reasons it doesn't look right; however, for unions I think it should be alright, as there's a boundary in the validation error.

The error we are handling here is the validation error of the field. If this field is defined like:

class Model(BaseModel):
    a: int | str

And a validation error happens on int, the field validator will try validating against str. At this point, we don't know what happened, and we will only get a val error if validation against str failed as well.

Edit: Actually it might be wrong if we have:

class Model1(BaseModel):
    a: int

class Model2(BaseModel): ...

class Model(BaseModel):
    m: Model1 | Model2

if validation fails for Model1.a, it will mutate the state (if it is the same state than when validating Model).

Now the question is: are unions the only concern? If so, we can make sure we somehow reset this state value on union validation, but there might be other concerns apart from unions :/

match e {
ValError::Omit => continue,
ValError::LineErrors(line_errors) => {
for err in line_errors {
errors.push(lookup_path.apply_error_loc(err, self.loc_by_alias, &field.name));
}
}
err => return Err(err),
}
}
Err(err) => return Err(err),
}
continue;
}
Expand Down
5 changes: 5 additions & 0 deletions src/validators/validation_state.rs
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,10 @@ pub struct ValidationState<'a, 'py> {
pub fields_set_count: Option<usize>,
// True if `allow_partial=true` and we're validating the last element of a sequence or mapping.
pub allow_partial: PartialMode,
// Whether at least one field had a validation error. This is used in the context of structured types
// (models, dataclasses, etc), where we need to know if a validation error occurred before calling
// a default factory that takes the validated data.
pub has_field_error: bool,
// deliberately make Extra readonly
extra: Extra<'a, 'py>,
}
Expand All @@ -36,6 +40,7 @@ impl<'a, 'py> ValidationState<'a, 'py> {
exactness: None,
fields_set_count: None,
allow_partial,
has_field_error: false,
extra,
}
}
Expand Down
7 changes: 6 additions & 1 deletion src/validators/with_default.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ use pyo3::PyVisit;
use super::{build_validator, BuildValidator, CombinedValidator, DefinitionsBuilder, ValidationState, Validator};
use crate::build_tools::py_schema_err;
use crate::build_tools::schema_or_config_same;
use crate::errors::{LocItem, ValError, ValResult};
use crate::errors::{ErrorTypeDefaults, LocItem, ValError, ValResult};
use crate::input::Input;
use crate::py_gc::PyGcTraverse;
use crate::tools::SchemaDict;
Expand Down Expand Up @@ -180,6 +180,11 @@ impl Validator for WithDefaultValidator {
outer_loc: Option<impl Into<LocItem>>,
state: &mut ValidationState<'_, 'py>,
) -> ValResult<Option<PyObject>> {
if matches!(self.default, DefaultType::DefaultFactory(_, true)) && state.has_field_error {
// The default factory might use data from fields that failed to validate, and this results
// in an unhelpul error.
return Err(ValError::new(ErrorTypeDefaults::DefaultFactoryNotCalled, input));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit annoying because I don't have access to any input inside the default_value method. @davidhewitt any idea how I could workaround this in a clean way? I could wrap up the return type in an enum of either Option<PyObject> or a new singleton sentinel value, but the rtype is already complex enough 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the current behavior is a bit weird:

class Model(BaseModel):
    a: int
    b: int

Model(b="fails")
"""
pydantic_core._pydantic_core.ValidationError: 2 validation errors for Model
a
  Field required [type=missing, input_value={'b': 'fails'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.11/v/missing
b
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='fails', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/int_parsing
"""

For b, a value was provided and is used as input_value. For a, no value is provided and pydantic-core uses the whole input dict as input_value (on L214, input is the provided dict):

match field.validator.default_value(py, Some(field.name.as_str()), state) {
Ok(Some(value)) => {
// Default value exists, and passed validation if required
model_dict.set_item(&field.name_py, value)?;
}
Ok(None) => {
// This means there was no default value
errors.push(field.lookup_key.error(
ErrorTypeDefaults::Missing,
input,
self.loc_by_alias,
&field.name,
));

So perhaps we could just use PydanticUndefined as an input value here, and then in the with_default validator, I can return:

Err(ValError::new(ErrorTypeDefaults::DefaultFactoryNotCalled, &self.undefined))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm interesting, I wonder if the error for a should not include input_value or input_type at all? Or maybe they should be PydanticUndefined 🤔

maybe we could special-case PydanticUndefined in errors, so it would just show

Field required [type=missing]

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll see if this can be done without too much special casing, I agree it would be better.

}
match self.default.default_value(py, state.extra().data.as_ref())? {
Some(stored_dft) => {
let dft: Py<PyAny> = if self.copy_default {
Expand Down
30 changes: 30 additions & 0 deletions tests/validators/test_with_default.py
Original file line number Diff line number Diff line change
Expand Up @@ -818,3 +818,33 @@ def _raise(ex: Exception) -> None:
v.validate_python(input_value)

assert exc_info.value.errors(include_url=False, include_context=False) == expected


def test_default_factory_not_called_if_existing_error() -> None:
class Test:
def __init__(self, a: int, b: int):
self.a = a
self.b = b

schema = core_schema.model_schema(
cls=Test,
schema=core_schema.model_fields_schema(
computed_fields=[],
fields={
'a': core_schema.model_field(
schema=core_schema.int_schema(),
),
'b': core_schema.model_field(
schema=core_schema.with_default_schema(
schema=core_schema.int_schema(),
default_factory=lambda data: data['a'],
default_factory_takes_data=True,
),
),
},
),
)

v = SchemaValidator(schema)
with pytest.raises(ValidationError):
v.validate_python({'a': 'not_an_int'})
Loading