TLDR
----

We have inconsistent silent-casting vs raising logic for numpy vs EA dtypes (and inconsistencies within EA dtypes). By deprecating silently casting to *object* dtype, we can *mostly* make the behaviors match.

Background
----------

A number of Series/DataFrame methods will silently cast when dealing with mismatched values. With a numpy dtype, each of the following silently cast to float64:

    ser = pd.Series([1, 2, 3], dtype="i8")

    ser.shift(1, fill_value=1.5)
    ser.mask([True, False, False], 1.5)
    ser.where([False, True, True], 1.5)
    ser.replace(1, 1.5)
    ser[0] = 1.5

    ser.fillna(1.5)  # <- this one doesn't cast as it is a no-op

If we were to pass "foo" or a pd.Period, these would coerce to object instead of float.

By contrast, similar mixed-type operations with an ExtensionDtype Series _mostly_ raise:

    ser2 = pd.Series(pd.period_range("2016-01-01", periods=3, freq="D"))

    ser2.shift(1, fill_value=1.5)          # <- ValueError
    ser2.mask([True, False, False], 1.5)   # <- ValueError
    ser2.where([False, True, True], 1.5)   # <- ValueError
    ser2.fillna(1.5)                       # <- TypeError
    ser2.replace(ser2[0], 1.5)             # <- coerces to object
    ser2[0] = 1.5                          # <- coerces to object

    ser3 = pd.Series([pd.NA, 2, 3], dtype="Int64")

    ser3.shift(1, fill_value=1.5)          # <- TypeError
    ser3.mask([True, False, False], 1.5)   # <- TypeError
    ser3.where([False, True, True], 1.5)   # <- TypeError
    ser3.fillna(1.5)                       # <- TypeError
    ser3.replace(ser3[0], 1.5)             # <- TypeError
    ser3[0] = 1.5                          # <- TypeError

timedelta64, datetime64, and datetime64tz mostly behave like the numpy dtypes, with a few exceptions:

- shift raises on mismatch
- fillna raises on mismatch for timedelta64, casts for the others

Categorical mostly behaves like other ExtensionDtypes, except for replace which has special logic.

Goals
-----

- Have matching behavior across dtypes.
- Share code.

Options
-------

1) Change EA (and dt64/td64) behavior to match non-EA behavior
2) Change non-EA behavior to match EA behavior (or stricter xref https://github.com/pandas-dev/pandas/issues/39584)
3) Deprecate (and eventually raise on) silent casting to _object_ dtype, allowing silent casting otherwise.

Here I am advocating for option 3). The advantages as I see them:

A) For numpy dtypes, we retain the most useful cases (int->float)

B) Deprecates cases most likely to be unintentional (e.g. typo "2016-01-01" -> "2p16-01-01" causing a datetime64 Series to silently cast)

C) For td64/dt64/dt64tz/period, the *only* silent casting is to object, so this completely gets rid of special-casing among that code

D) For IntegerArray, FloatingArray, IntervalArray leaves open the option of allowing e.g. Integer->Floating casting (xref https://github.com/pandas-dev/pandas/issues/25288#issuecomment-941762174)

E) Does not preclude later deciding on the stricter options in 2)
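
To make the policy in option 3) concrete, here is a rough, self-contained sketch of the decision rule. It is a toy model, not pandas internals: `promoted_dtype` and `check_setitem_cast` are hypothetical names, dtypes are represented as plain strings, and the promotion table only covers int64/float64 against Python scalars. It is not meant to match pandas' behavior for every scalar type.

```python
import warnings


def promoted_dtype(current: str, value: object) -> str:
    """Toy promotion rule (hypothetical): dtype name needed to hold `value`."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; this toy model treats it as a mismatch.
        return "object"
    if isinstance(value, int):
        return current if current in ("int64", "float64") else "object"
    if isinstance(value, float):
        if current == "float64":
            return "float64"
        if current == "int64":
            # A float with an integral value can be held without casting.
            return "int64" if value.is_integer() else "float64"
        return "object"
    # Strings, Periods, and anything else would force object dtype.
    return "object"


def check_setitem_cast(current: str, value: object) -> str:
    """Option 3 policy: allow int->float style casts, warn on casts to object."""
    new = promoted_dtype(current, value)
    if new == "object" and current != "object":
        warnings.warn(
            f"Setting {value!r} would silently cast {current} to object; "
            "this is deprecated and will raise in the future.",
            FutureWarning,
            stacklevel=2,
        )
    return new


# Retained: the useful int -> float cast stays silent.
check_setitem_cast("int64", 1.5)

# Deprecated: a mismatched value that would force object dtype warns.
check_setitem_cast("int64", "foo")
```

Under this rule the int->float case from advantage A) passes silently, while the typo case from B) surfaces as a warning (and later an error) instead of a silent cast.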