Current File : //usr/local/lib64/python3.6/site-packages/pandas/core/groupby/__pycache__/generic.cpython-36.pyc
3

���h�@s�dZddlmZmZddlZddlmZddlmZddl	Z	ddl	m
Z
mZmZm
Z
mZmZmZmZmZmZmZmZmZmZddlZddlZddlmZddlmZmZdd	l m!Z!m"Z"m#Z#dd
l$m%Z%m&Z&m'Z'm(Z(m)Z)ddl*m+Z+m,Z,m-Z-m.Z.m/Z/m0Z0m1Z1m2Z2m3Z3ddl4m5Z5m6Z6dd
l7m8Z8m9Z9m:Z:ddl;j<j=Z=ddl>m?Z?m@Z@ddlAj<jBZCddlDmEZEddlFmGZGddlHmIZImJZJmKZKddlLmMZMddlNmOZOmPZPmQZQmRZRmSZSddlTmUZUmVZVmWZWddlXj<jYjMZZddl[m\Z\m]Z]ddl^m_Z_ddl`maZambZbmcZcmdZdddlemfZfe
�rddl[mgZgedddg�ZheeiedeffZje	jkd�Zleieed�d d!�Zmeeeeid"�d#d$�Znene_eMjo�Gd%d&�d&eOe_��ZpeneGeMjq�Gd'd(�d(eOeG��ZreGeGd)�d*d+�ZsdS),z�
Define the SeriesGroupBy and DataFrameGroupBy
classes that hold the groupby interfaces (and some implementations).

These are user facing as the result of the ``df.groupby(...)`` operations,
which here returns a DataFrameGroupBy object.
�)�abc�
namedtupleN)�partial)�dedent)�
TYPE_CHECKING�Any�Callable�Dict�	FrozenSet�Iterable�List�Mapping�Optional�Sequence�Tuple�Type�Union�cast)�lib)�
FrameOrSeries�FrameOrSeriesUnion)�Appender�Substitution�doc)�maybe_cast_result�maybe_cast_result_dtype�maybe_convert_objects�maybe_downcast_numeric�maybe_downcast_to_dtype)	�ensure_int64�ensure_platform_int�is_bool�is_integer_dtype�is_interval_dtype�is_numeric_dtype�is_object_dtype�	is_scalar�needs_i8_conversion)�isna�notna)�maybe_mangle_lambdas�reconstruct_func�validate_func_kwargs)�	DataError�SpecificationError)�!create_series_with_explicit_dtype)�	DataFrame)�ABCDataFrame�	ABCSeries�NDFrame)�base)�GroupBy�
_agg_template�_apply_docs�_transform_template�get_groupby)�Index�
MultiIndex�all_indexes_same)�BlockManager�
make_block)�Series)�NUMBA_FUNC_CACHE�generate_numba_func�maybe_use_numba�split_for_numba)�boxplot_frame_groupby)�Block�NamedAgg�columnZaggfunc.�ScalarResult)�name�klasscs0�fdd�}t|��}|jpd|_�|_t|�S)z�
    Create a property for a GroupBy subclass to dispatch to DataFrame/Series.

    Parameters
    ----------
    name : str
    klass : {DataFrame, Series}

    Returns
    -------
    property
    cs
|j��S)N)Z
_make_wrapper)�self)rI��=/tmp/pip-build-5_djhm0z/pandas/pandas/core/groupby/generic.py�proptszgenerate_property.<locals>.prop�)�getattr�__doc__�__name__�property)rIrJrNZ
parent_methodrL)rIrM�generate_propertyfs

rT)rJ�	allowlistcs��fdd�}|S)a�
    Create GroupBy member defs for DataFrame/Series names in an allowlist.

    Parameters
    ----------
    klass : DataFrame or Series class
        class where members are defined.
    allowlist : frozenset[str]
        Set of names of klass methods to be constructed

    Returns
    -------
    class decorator

    Notes
    -----
    Since we don't want to override methods explicitly defined in the
    base class, any such name is skipped.
    cs4x.�D]&}t||�rqt|��}t|||�qW|S)N)�hasattrrT�setattr)�clsrIrN)rUrJrLrM�pinner�s


z*pin_allowlisted_properties.<locals>.pinnerrL)rJrUrYrL)rUrJrM�pin_allowlisted_properties}srZcs�eZdZejZeed�dd�Ze	dd��Z
ed�Ze
edjded	d
���fdd��Zeeed
d�d=ddd�dd��ZeZdd�Zeejeeejffeeeeefd�dd�Zeejeeejffeeeeefd�dd�Zeejeeejffed�dd�Z d>eee!e"e#e$d�dd �Z%d!d"�Z&e'd
d#�e
e(�ddd�d$d%���Z)d&dd�d'd(�Z*ed�d)d*�Z+d?d,d-�Z,d@e#ed.�d/d0�Z-eej.�d1d2��Z.dAd3d4�Z/ed�d5d6�Z0d7d8�Z1dBd;d<�Z2�Z3S)C�
SeriesGroupBy)�returnccs|jVdS)N)�
_selected_obj)rKrLrLrM�_iterate_slices�szSeriesGroupBy._iterate_slicescCs|jdkr|jjS|jSdS)z�
        since we are a series, we by definition only have
        a single name, but may be the result of a selection or
        the name of our object
        N)�
_selection�objrI)rKrLrLrM�_selection_name�s
zSeriesGroupBy._selection_namea�
    Examples
    --------
    >>> s = pd.Series([1, 2, 3, 4])

    >>> s
    0    1
    1    2
    2    3
    3    4
    dtype: int64

    >>> s.groupby([1, 1, 2, 2]).min()
    1    1
    2    3
    dtype: int64

    >>> s.groupby([1, 1, 2, 2]).agg('min')
    1    1
    2    3
    dtype: int64

    >>> s.groupby([1, 1, 2, 2]).agg(['min', 'max'])
       min  max
    1    1    2
    2    3    4

    The output column names can be controlled by passing
    the desired column names and aggregations as keyword arguments.

    >>> s.groupby([1, 1, 2, 2]).agg(
    ...     minimum='min',
    ...     maximum='max',
    ... )
       minimum  maximum
    1        1        2
    2        3        4�templateZseriesZseries_examples)�input�examplescst�j|f|�|�S)N)�super�apply)rK�func�args�kwargs)�	__class__rLrMrf�szSeriesGroupBy.applyr?)rdrJN)�engine�
engine_kwargsc

Osl|dk}d}|r t|�\}}i}t|t�r:t||�||�St|tj�rdt|�}|j|�}|rb||_n�|j	|�}	|	r�|r�|r�t||	��S|j
jdkr�|j|f|�||d�|��Sy|j|f|�||d�|��St
tfk
r�|j|f|�|�}
YnXtt|
�|j
jdd�}t|
|td�}|j�s0td�t|t��rhddlm}||j�ddd	�|j�D�d
�}|S)N�)rkrlr)rI)�index�dtype_if_emptyzWarning, ignoring as_index=True)�concatcSsg|]
}|j�qSrL)�label)�.0�keyrLrLrM�
<listcomp>sz+SeriesGroupBy.aggregate.<locals>.<listcomp>)�axis�keys)r,�
isinstance�strrPrrr*�_aggregate_multiple_funcs�columns�_get_cython_func�grouper�nkeys�_python_agg_general�
ValueError�KeyError�_aggregate_namedr:�sorted�namesr/�object�as_index�print�dictZpandasrp�valuesrv)
rKrgrkrlrhri�
relabelingrz�retZcyfunc�resultrnrprLrLrM�	aggregate�s@


 zSeriesGroupBy.aggregatec
Cs(t|t�r4t|jt�rtd��t|j��}|j�}n^tdd�|D��rbdd�|D�}t	t
|��}n0g}x |D]}|jtj
|�p�|�qlWt
||�}i}xVt|�D]J\}\}}|}||jkr�tj|�}|j�||_|j|�|tj||d�<q�Wtdd�|j�D���r
|S|j|dd�}	|jj|	|d	�S)
Nznested renamer is not supportedcss|]}t|ttf�VqdS)N)rw�tuple�list)rr�xrLrLrM�	<genexpr>(sz:SeriesGroupBy._aggregate_multiple_funcs.<locals>.<genexpr>cSs&g|]}t|ttf�s||fn|�qSrL)rwr�r�)rrr�rLrLrMrt)sz;SeriesGroupBy._aggregate_multiple_funcs.<locals>.<listcomp>)rq�positioncss|]}t|t�VqdS)N)rwr0)rrr�rLrLrMr�As)rn)rz)rwr�r]r?r.r�rv�items�any�next�zip�append�comZget_callable_name�	enumerate�copy�_reset_cacher_r�r4�	OutputKeyr��_wrap_aggregated_outputr`�_constructor_expanddim)
rK�argrz�f�results�idxrIrgr`�outputrLrLrMrys0





z'SeriesGroupBy._aggregate_multiple_funcs)r�rnr\cCsxdd�|j�D�}tdd�|D��}t|�dkrH|jj||d�}||_n,|jsj|jj|d||dd�}n
|jj�}|S)	a"
        Wraps the output of a SeriesGroupBy operation into the expected result.

        Parameters
        ----------
        output : Mapping[base.OutputKey, Union[Series, np.ndarray]]
            Data to wrap.
        index : pd.Index or None
            Index to apply to the output.

        Returns
        -------
        Series or DataFrame

        Notes
        -----
        In the vast majority of cases output and columns will only contain one
        element. The exception is operations that expand dimensions, like ohlc.
        cSsi|]\}}||j�qSrL)r�)rrrs�valrLrLrM�
<dictcomp>asz5SeriesGroupBy._wrap_series_output.<locals>.<dictcomp>css|]}|jVqdS)N)rq)rrrsrLrLrMr�bsz4SeriesGroupBy._wrap_series_output.<locals>.<genexpr>rm)rnr)rnrI)r�r:�lenr`r�rz�empty�_constructor)rKr�rn�indexed_outputrzr�rLrLrM�_wrap_series_outputIs
z!SeriesGroupBy._wrap_series_outputcCs|j||d�}|j|�S)a�
        Wraps the output of a SeriesGroupBy aggregation into the expected result.

        Parameters
        ----------
        output : Mapping[base.OutputKey, Union[Series, np.ndarray]]
            Data to wrap.

        Returns
        -------
        Series or DataFrame

        Notes
        -----
        In the vast majority of cases output will only contain one element.
        The exception is operations that expand dimensions, like ohlc.
        )r�rn)r��_reindex_output)rKr�rnr�rLrLrMr�rsz%SeriesGroupBy._wrap_aggregated_output)r�r\cCs4t|�dkst�|j||jjd�}t|t�s0t�|S)a�
        Wraps the output of a SeriesGroupBy transformation into the expected result.

        Parameters
        ----------
        output : dict[base.OutputKey, Union[Series, np.ndarray]]
            Dict with a sole key of 0 and a value of the result values.

        Returns
        -------
        Series

        Notes
        -----
        output should always contain one element. It is specified as a dict
        for consistency with DataFrame methods and _wrap_aggregated_output.
        rm)r�rn)r��AssertionErrorr�r`rnrwr?)rKr�r�rLrLrM�_wrap_transformed_output�sz&SeriesGroupBy._wrap_transformed_outputF)rvr��not_indexed_samer\cs�t��dkr$�jjg�j�tjd�S|dk	s0t�td���fdd�}t|dt	�r�|�}�j
�jj||d��}|j�j
d�}�j|_|St|dttf�r��j�||d	�S�jj||��jd
�}�j
|�SdS)a�
        Wrap the output of SeriesGroupBy.apply into the expected result.

        Parameters
        ----------
        keys : Index
            Keys of groups that Series was grouped by.
        values : Optional[List[Any]]
            Applied output for each group.
        not_indexed_same : bool, default False
            Whether the applied outputs are not indexed the same as the group axes.

        Returns
        -------
        DataFrame or Series
        r)rIrn�dtypeN)r\cs8�jjdkr tj��jjd�}nt��jjdd�}|S)Nrm)r�r)rI)r|r}r;�from_tuplesr�r:)rn)rvrKrLrM�
_get_index�sz6SeriesGroupBy._wrap_applied_output.<locals>._get_index)rn)�dropna)r�)�datarnrI)r�r`r�ra�np�float64r�r:rwr�r�r��stack�observedrIr?r0�_concat_objects)rKrvr�r�r�rnr�rL)rvrKrM�_wrap_applied_output�s"z"SeriesGroupBy._wrap_applied_outputcOsRi}xH|D]@\}}||_||f|�|�}t|tttjf�rBtd��|||<q
W|S)NzMust produce aggregated value)rIrwr?r:r��ndarrayr)rKrgrhrir�rI�groupr�rLrLrMr��szSeriesGroupBy._aggregate_named)rJc
Os�|j|�p|}t|t�s4|j|f|�||d�|��S|tjkrTd|�d�}t|��n4|tjkrnt||�||�S|tj	kr�t||�||�St
j|dd��t||�||�}WdQRX|j|�S)N)rkrl�'z2' is not a valid function name for transform(name)r�T)
r{rwrx�_transform_generalr4�transform_kernel_allowlistr�cythonized_kernelsrP�transformation_kernelsr��temp_setattr�_transform_fast)rKrgrkrlrhri�msgr�rLrLrM�	transform�s




zSeriesGroupBy.transform�cythoncOs"t|�rt|||d�\}}t|j�}g}	x�|D]�\}
}tj|d|
�t|�rzt|�\}}
|||
f|��}|tkr�|t|<n||f|�|�}t|t	t
f�r�|j}|j|
�}|||�}|	j
|�q.W|	r�ddlm}||	�j�}n|jjtjd�}|jj}t|��r
t||�}|jj|_|jj|_|S)z2
        Transform with a non-str `func`.
        �groupby_transformrIr)rp)r�)rBrA�typer]r��__setattr__rCr@rwr1r2�_valuesr�r��pandas.core.reshape.concatrpZ
sort_indexr`r�r�r�r�r$rrIrn)rKrgrkrlrhri�
numba_func�	cache_keyrJr�rIr�r�rn�res�indexerZserrpr�r�rLrLrMr�s8







z SeriesGroupBy._transform_generalcCsH|jj\}}}|j|jjdd�}tj|j|�}|jj||jj	|jj
d�S)zf
        fast version of transform, only applicable to
        builtin/cythonizable functions
        F)r�)rnrI)r|�
group_info�reindex�result_index�
algorithms�take_1dr�r`r�rnrI)rKr��ids�_�ngroup�outrLrLrMr�5szSeriesGroupBy._transform_fastTcs�t�t�r���fdd��n���fdd��td��fdd��y��fdd��D�}Wn0ttfk
r�}ztd	�|�WYd
d
}~XnX�j||�}|S)a�
        Return a copy of a Series excluding elements from groups that
        do not satisfy the boolean criterion specified by func.

        Parameters
        ----------
        func : function
            To apply to each group. Should return True or False.
        dropna : Drop groups that do not pass the filter. True by default;
            if False, groups that evaluate False are filled with NaNs.

        Examples
        --------
        >>> df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
        ...                           'foo', 'bar'],
        ...                    'B' : [1, 2, 3, 4, 5, 6],
        ...                    'C' : [2.0, 5., 8., 1., 2., 9.]})
        >>> grouped = df.groupby('A')
        >>> df.groupby('A').B.filter(lambda x: x.mean() > 3.)
        1    2
        3    4
        5    6
        Name: B, dtype: int64

        Returns
        -------
        filtered : Series
        cst|�����S)N)rP)r�)rhrgrirLrM�<lambda>]sz&SeriesGroupBy.filter.<locals>.<lambda>cs�|f����S)NrL)r�)rhrgrirLrMr�_s)r\cs�|f|�|�}|ot|�S)N)r))r�rhri�b)�wrapperrLrM�true_and_notnabsz,SeriesGroupBy.filter.<locals>.true_and_notnacs"g|]\}}�|�r�j|��qSrL)r�)rrrIr�)rKr�rLrMrthsz(SeriesGroupBy.filter.<locals>.<listcomp>z'the filter must return a boolean resultN)rwrx�boolr�	TypeError�
_apply_filter)rKrgr�rhri�indices�err�filteredrL)rhrgrirKr�r�rM�filter?s
zSeriesGroupBy.filter)r�r\cCs�|jj\}}}|jj}tj|dd�\}}tj||f�}||}||}tjddtj	|dd�|dd�k�df}tjd|dd�|dd�kf}|d
k}	|r�d||<d||	<n&d||	tjd|	dd�f@<d||<tj
j||�jddd�}
t
|��r.|ddk�r(|
dd�}|tj|�}n|
}n|
dd�}|jj}t
|�t
|�k�rztjt
|�|
jd�|}}
|
|||<|jj|||jd	�}
|j|
dd
�S)z�
        Return number of unique elements in the group.

        Returns
        -------
        Series
            Number of unique values within each group.
        F)�sortrrmN�int64)r�)r�)rnrI)�
fill_value���r�r�r�r�)r|r�r`r�r��	factorizer��lexsort�r_�nonzero�add�reduceat�astyper�Zflatnonzeror��zerosr�r�rar�)rKr�r�r�r��codes�sorterr��inc�maskr�r��rir�rLrLrM�nuniqueps6	0"

zSeriesGroupBy.nuniquecs*|j�fdd��}|jdkr"|jS|j�S)Ncs|jf��S)N)�describe)r�)rirLrMr��sz(SeriesGroupBy.describe.<locals>.<lambda>rm)rfru�TZunstack)rKrir�rL)rirMr��s
zSeriesGroupBy.describecs�ddlm}ddlm}|dk	rBtj|�rB|jtj||||d�S|j	j
\}}	}	|jj}
|dk�|�|
�}}
|dkr�t
j|
dd�\}}dd	�}
n0|t|
�|dd
�}|jj}|j|jj�}dd	�}
t|j�r�tj|j|j|f�}ntj||f�}||||}}tjddtj|dd�|dd�k�df}|
|tdd��|
|tdd��k}tjd|f}d||<tjtjtj|df�d�}ttjtjj||�d��|j	j }�fd
d�|D�|
||�g}dd�|j	j!D�|g}|j	j"|j#g}|�r.|d d!k��j$��rd}n|��fdd�|D�}}|�r�|j%d�}tjtj|t&|�f�}|�r�||d"k}tjj'||d#��|��}n�|�}||}|�r�|dk�r�|�r�||�n||}tj|�r�|n||f�}|||d$|}|d%<|dk�r4t(|||dd�}t)|��r t*|�}|jj+|||j#d�Stj,t&|�dd��x<|dd&�D],}�tjd|dd�|dd'�kfO��qTW�j-�t&|d(�}�tjtj.|���tj/tj.��|�g}�j0�d|d)g}|||ddd�\}	}tj1|d*k||d�}|�r:tj|�r|n||df�}|||d+|}|d,<tj2tj2d���fdd���fdd�|dd-�D�}|j3|d.�t(|||dd�}t)|��r�t*|�}|jj+|||j#d�S)/Nr)�_get_join_indexers)�cut)�	normalizer��	ascending�binsrmT)r�cSs||S)NrL)�labr�rLrLrMr��sz,SeriesGroupBy.value_counts.<locals>.<lambda>)Zinclude_lowestcSs||jjdS)Nrmr�)Z_multiindexr�)r�r�rLrLrMr��s)Zrepeatscsg|]}�|��qSrLrL)rr�level_codes)�reprLrMrt�sz.SeriesGroupBy.value_counts.<locals>.<listcomp>cSsg|]
}|j�qSrL)Zgroup_index)rrZpingrLrLrMrt�sFcsg|]}|��qSrLrL)rrr�)r�rLrMrt�s�float)�levelsr�r��verify_integrity)rnrIr�)r��left)r��how)�	lev_codesr\cstj|���S)N)r��repeat)r)�diff�nbinrLrM�build_codessz/SeriesGroupBy.value_counts.<locals>.build_codescsg|]}�|��qSrLrL)rrr)rrLrMrt!sr�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�)4Zpandas.core.reshape.merger�Zpandas.core.reshape.tiler�r��iterablerfr?�value_countsr|r�r`r�r�r��cat�
categories�taker�r#r�r�r�rightr�r��slicerrrr�r�Zreconstructed_codes�	groupingsr�ra�allr�r�Zatr;r"rr�r��sum�arangeZtile�cumsum�wherer�r�)rKr�r�r�r�r�r�r�r�r�r�r��levZllabr�r�Zlchangesr�r�r�r�r��d�m�accr
�mir�Zncatrr
rL)rrr�rr�rMr	�s�

0 



,$
zSeriesGroupBy.value_countsc	Csr|jj\}}}|jj}|dkt|�@}t|�}|p6d}tj|||d�}|jj||jj	|j
dd�}|j|dd�S)z�
        Compute count of group, excluding missing values.

        Returns
        -------
        Series
            Count of values within each group.
        rmr)�	minlengthr�)rnrIr�)r�r�)r|r�r`r�r(r r�Zbincountr�r�rar�)	rKr�r��ngroupsr�r�rr�r�rLrLrM�count*s	zSeriesGroupBy.countcCs||�S)z return a pass thru rL)rKrgrLrLrM�_apply_to_column_groupbysCsz'SeriesGroupBy._apply_to_column_groupbysrm�padcsd�r|j����fdd��S�dkr,d�d�t|���d�}|j|jj�}|j��d�}||dS)	z=Calculate pct_change of each value to previous entry in groupcs|j����d�S)N)�periods�fill_method�limit�freq)�
pct_change)r�)r r"r!rrLrMr�Lsz*SeriesGroupBy.pct_change.<locals>.<lambda>Nrr)r!)rr"rm)rfrP�groupbyr|r��shift)rKrr r!r"ZfilledZfill_grpZshiftedrL)r r"r!rrMr#GszSeriesGroupBy.pct_change)N)F)T)T)FTFNT)rmrNN)4rR�
__module__�__qualname__r4�series_apply_allowlist�_apply_allowlistrr?r^rSrar�_agg_examples_docrr7�formatrfrr6r��aggryr
r�rr�r�rr:r0r�r�r�rrr�rr�r�rr8r�r�r�r�r�r�r	rrr#�
__classcell__rLrL)rjrMr[�sH%
4.'32

15
}r[cs�eZdZejZed�Zee	edd�dKddd�dd��Z
e
Zee
d�d	d
�ZdMeeeed
�dd�ZdOeeedd
�dd�Zed�dd�Zed�dd�ZdPdd�Zddd�dd�Zedd�ee�ddd�dd���Zeed �d!d"�Zd#d$�Zeeed%�d&d'�Z eed(�d)d*�Z!dQd+d,�Z"�fd-d.�Z#dRed/�d0d1�Z$ed�d2d3�Z%e&d�d4d5�Z'd6d7�Z(e)ej*e+e
e,j-ffe.e/ed8�d9d:�Z0e)ej*e+e
e,j-ffed;�d<d=�Z1d>e/ed?�d@dA�Z2dBdC�Z3dDdE�Z4dFdG�Z5dSedH�dIdJ�Z6e7Z8�Z9S)T�DataFrameGroupBya&
    Examples
    --------
    >>> df = pd.DataFrame(
    ...     {
    ...         "A": [1, 1, 2, 2],
    ...         "B": [1, 2, 3, 4],
    ...         "C": [0.362838, 0.227877, 1.267767, -0.562860],
    ...     }
    ... )

    >>> df
       A  B         C
    0  1  1  0.362838
    1  1  2  0.227877
    2  2  3  1.267767
    3  2  4 -0.562860

    The aggregation is for each column.

    >>> df.groupby('A').agg('min')
       B         C
    A
    1  1  0.227877
    2  3 -0.562860

    Multiple aggregations

    >>> df.groupby('A').agg(['min', 'max'])
        B             C
      min max       min       max
    A
    1   1   2  0.227877  0.362838
    2   3   4 -0.562860  1.267767

    Select a column for aggregation

    >>> df.groupby('A').B.agg(['min', 'max'])
       min  max
    A
    1    1    2
    2    3    4

    Different aggregations per column

    >>> df.groupby('A').agg({'B': ['min', 'max'], 'C': 'sum'})
        B             C
      min max       sum
    A
    1   1   2  0.590715
    2   3   4  0.704907

    To control the output names with different aggregations per column,
    pandas supports "named aggregation"

    >>> df.groupby("A").agg(
    ...     b_min=pd.NamedAgg(column="B", aggfunc="min"),
    ...     c_sum=pd.NamedAgg(column="C", aggfunc="sum"))
       b_min     c_sum
    A
    1      1  0.590715
    2      3  0.704907

    - The keywords are the *output* column names
    - The values are tuples whose first element is the column to select
      and the second element is the aggregation to apply to that column.
      Pandas provides the ``pandas.NamedAgg`` namedtuple with the fields
      ``['column', 'aggfunc']`` to make it clearer what the arguments are.
      As usual, the aggregation can be a callable or a string alias.

    See :ref:`groupby.aggregate.named` for more.r0)rdrJN)rkrlcOs�t|f|�\}}}}t|�r8|j|f|�||d�|��St|�}|j|f|�|�\}	}
|
dkrb|	S|	dk�rT|jjdkr�|j|f|�|�S|s�|r�|j|f|�|�}	n�|jdkr�|j|�}	n�y:|j	|g|jd�}	|	j
j|jj
j
g|	j
j�jd�|	_
Wn\tk
�r4}z dt|�k�r�|j|�}	WYdd}~Xn tk
�rR|j|�}	YnX|�rr|	jdd�|f}	||	_
|j�s�|j|	�tjt|	��|	_|	jdd�S)N)rkrlrm)Z_axisz
no resultsT)�datetimer�)r+rBr~r*Z
_aggregater|r}�_aggregate_frameruryrz�renamer]rI�nlevelsZ	droplevelrrx�AttributeError�ilocr��_insert_inaxis_grouper_inplacer�rr�rn�_convert)rKrgrkrlrhrir�rz�orderr�rr�rLrLrMr��sB


zDataFrameGroupBy.aggregate)r\ccs`|j}|jdkr|j}t|t�r4|j|jkr4|Vn(x&|j�D]\}}||jkrRq>|Vq>WdS)Nrm)r]rur�rwr?rI�
exclusionsr�)rKr`rqr�rLrLrMr^�s

z DataFrameGroupBy._iterate_slicesTrm)r�numeric_only�	min_countr\cCs$|j||||d�\}}|j||d�S)N)�altr9r:)r�)�_cython_agg_blocks�_wrap_agged_blocks)rKrr;r9r:�
agg_blocks�	agg_itemsrLrLrM�_cython_agg_general�sz$DataFrameGroupBy._cython_agg_generalzTuple[List[Block], Index]cs��j�}|r|jdd�}g}g}g}g}	g}
t�}�x:|jD�].}|}
|jj}y �jj|j|d|d�\}
}W�n.t	k
�r��dkr�|dks�t
�|j|�w<�j|j
|}|jddkr�|jdd�df}t|�jdd�}y|j��fd	d
��}
Wn"tk
�r|j|�w<Yn~Xtt|
�}
t|
jj�dk�rN|	j|�|
j|
�w<t|
jj�dk�sdt
�|
jjdj}
t|
tj��r�|
jdk�r�|
jdd
�}
YnXt|
t��s�t
�|
|k	�rXt|j|�}t|
|�}
|j�rNt|
tj��rN|
jdk�s|
jddk�st
�y t|j�j |
j!�|jjd�}
Wn&t"tfk
�rL|
jdd�}
YnX|j#|
�}|j|�|j|�q<W|�px|
�s�t$d��|	�rx�t%|	|
�D]t\}}
t|�|
jdk�s�t
�xPt&|�D]D\}}|jtj'|g|jd��|j|
jdd�|gfjjd��q�W�q�Wtj(|�}|j
j)tj*|��}|�rntj(|�}tj+t|��}tj,t|��}d||<||j-�|}d}x2|D]*}t|j�}||||�|_||7}�qxW||fS)NF)r�rm)rur:ZohlcrT)r�cs�|�jd�S)N)ru)ru)r�)r;rKrLrMr�6sz5DataFrameGroupBy._cython_agg_blocks.<locals>.<lambda>)r�zNo numeric types to aggregater�r�).�_get_data_to_aggregateZget_numeric_datar��blocks�mgr_locsZas_arrayr|r�r��NotImplementedErrorr�r�r`r��shaper4r9r�rr0r��_mgrrwr�r��ndim�reshaperr�rZis_extensionr�Z_from_sequenceZravelrr>r-r�r��array�concatenaterr�rr�r)rKrr;r9r:r�r>Z	new_itemsZ
deleted_itemsZsplit_itemsZsplit_framesZ	no_result�blockr��locsr�r`�sr�Z	agg_block�i�locr�r?ZdeletedZair��offset�blkrL)r;rKrMr<s�






 


.



z#DataFrameGroupBy._cython_agg_blocksc
Os�|jjdkrtd��|j}|j}i}||jkrZx`|D] \}}||f|�|�}	|	||<q4Wn6x4|jD]*}|j||d�}||f|�|�}	|	||<qbW|j||�S)NrmzNumber of keys must be 1)r`)	r|r}r�ru�_obj_with_exclusionsZ_info_axis_numberr�Z	get_group�_wrap_frame_output)
rKrgrhrirur`r�rIr�ZfresrLrLrMr0�s
z!DataFrameGroupBy._aggregate_framec
Os�|j}i}g}x�|D]�}||}t|||jd�}	|j|�}
y|	j|f|�|�||<Wn<tk
r�}z dt|�krt�|j|�wWYdd}~XqX|
rt|||�||<qW|j	}|r�|j
|�}|jj||d�S)N)�	selectionr|zMust produce aggregated value)rz)
rRr[r|Z_transform_should_castr�rrxr�rrzZdropr`r�)
rKrgrhrir`r�Z
cannot_agg�itemr�Zcolgrr�Zresult_columnsrLrLrM�_aggregate_item_by_item�s(



z(DataFrameGroupBy._aggregate_item_by_itemFcs�t|�dkr|jj|d�Sttj|�d�}|dkr<|jj�St|t�rV|j|||d�S|j	rd|j
jnd}t|t�r�|j
�}tf|dti����fdd�|D�}|d}t|tjttf�s�|j	�r�t|t��r,|jj|j�}tdd�|D��}	t|�d	k�o|jd	k}
|j�rL|
�r2|d|d_|j|||d�S|	�rLdd
lm}||�S|	�sb|j||dd�S|jj�}|jdk�r�dd
�|D�}
t|
�d	k�r�t|
�d|_tjdd�|D��}|jdk�r|}|jj�}|jdk�rdd
�|D�}
t|
�d	k�rt|
�d|_n|j}|}|j }|jj|||d�}n0|j	�sHt|||j!gd�}n|jj"|||j#d�S|j}|j$dk�r�|j%j&t'�j(��r�t)|�}n|j*dd�}|j	�s�|j+|�|j,|�S|jj"||d�SdS)Nr)rn)r�rocsg|]}|dk	r|n��qS)NrL)rrr�)�backuprLrMrt�sz9DataFrameGroupBy._wrap_applied_output.<locals>.<listcomp>css|]}|jVqdS)N)rn)rrr�rLrLrMr��sz8DataFrameGroupBy._wrap_applied_output.<locals>.<genexpr>rm)rpTcSsh|]
}|j�qSrL)rI)rr�vrLrLrM�	<setcomp>	sz8DataFrameGroupBy._wrap_applied_output.<locals>.<setcomp>cSsg|]}tj|��qSrL)r�Zasarray)rrrXrLrLrMrtscSsh|]
}|j�qSrL)rI)rrrXrLrLrMrYs)rnrz)rnrI�)r/)-r�r`r�r�r�Znot_nonerwr0r�r�r|r�r?Z_construct_axes_dictr/r�r�r�r:r]�	_get_axisrur<r2�squeezerIr�rprnr�r�Zvstackr�r_Z_constructor_slicedrarG�dtypesrfr'r��_recast_datetimelike_resultr6r5r�)rKrvr�r�Zfirst_not_noneZ	key_indexrirXZ
applied_indexZall_indexed_sameZsingular_seriesrprnr�Zstacked_valuesrzr��sorL)rWrMr��sx







z%DataFrameGroupBy._wrap_applied_outputr�cOs$ddlm}g}|j}|jj||jd�}	t|�rDt|||d�\}
}n|j|f|�|�\}}
�xp|	D�]f\}}t	j
|d|�t|�r�t|�\}}|
||f|��}|tkr�|
t|<|j
j||j|jd�}ndy|j||
|�\}}WnLtk
r�|j||�Stk
�r(}zd}t|�|�WYdd}~XnXt|t��r�tj|j��sHqbnl|jj|j��r�||gt|j�dd�}|j|_|j|_n2|j
jtj|jgt|j��j|j�|j|jd	�}|j|�qb|j|�qbW|jdk�r�|jn|j}|jdk�r�dnd}|||jd
d�}|j ||d
d�}|j!|�S)
Nr)rp)rur�rI)rnrzz3transform must return a scalar value for each grouprm)rzrnF)rur)rur�)"r�rprRr|�get_iteratorrurBrA�
_define_pathsr�r�rCr@r`r�rnrz�_choose_pathr��_transform_item_by_itemrrwr?r��prodrE�is_r�rJr�rHr�r�Z_set_result_index_ordered)rKrgrkrlrhrirpZappliedr`�genr�r��	fast_path�	slow_pathrIr�r�rnr��pathr�r��rZconcat_index�
other_axisZconcatenatedrLrLrMr�DsV

z#DataFrameGroupBy._transform_general)rJcOs�|j|�p|}t|t�s4|j|f|�||d�|��S|tjkrTd|�d�}t|��n4|tjkrnt||�||�S|tj	kr�t||�||�S|tj
kr�tj|dd��t||�||�}WdQRXt|t
�r�|jj|jj�r�|j|�S|j|f|�||d�|��S)N)rkrlr�z2' is not a valid function name for transform(name)r�T)r{rwrxr�r4r�rr�rPr�Zreduction_kernelsr�r�r0rz�equalsrRr�)rKrgrkrlrhrir�r�rLrLrMr��s&







zDataFrameGroupBy.transform)r�r\cCs||j}|jj\}}}|j|jjdd�}g}x8t|j�D]*\}}|jtj	|j
dd�|fj|��q6W|jj
j||j|jd�S)z6
        Fast transform path for aggregations
        F)r�N)rzrn)rRr|r�r�r�r�rzr�r�r�r4r�r`r�Z_from_arraysrn)rKr�r`r�r�r�r�rNrLrLrMr��s&z DataFrameGroupBy._transform_fastcsXt�t�r.���fdd�}����fdd�}n"���fdd�}����fdd�}||fS)Ncst|�����S)N)rP)r�)rhrgrirLrMr��sz0DataFrameGroupBy._define_paths.<locals>.<lambda>cs|j���fdd��jd�S)Ncst|�����S)N)rP)r�)rhrgrirLrMr��szBDataFrameGroupBy._define_paths.<locals>.<lambda>.<locals>.<lambda>)ru)rfru)r�)rhrgrirKrLrMr��scs�|f����S)NrL)r�)rhrgrirLrMr��scs|j���fdd��jd�S)Ncs�|f����S)NrL)r�)rhrgrirLrMr��szBDataFrameGroupBy._define_paths.<locals>.<lambda>.<locals>.<lambda>)ru)rfru)r�)rhrgrirKrLrMr��s)rwrx)rKrgrhrirgrhrL)rhrgrirKrMra�s
zDataFrameGroupBy._define_paths)rgrhr�cCs�|}||�}y||�}Wn,tk
r.�Yntk
rD||fSXt|t�sX||fS|jj|j�sn||fS|j|�r||}||fS)N)r��	Exceptionrwr0rzrl)rKrgrhr�rir�Zres_fastrLrLrMrb�s


zDataFrameGroupBy._choose_path)r`r\cCs�i}g}xJt|�D]>\}}y||j|�||<Wntk
rDYqX|j|�qWt|�dkrhtd��|j}t|�t|j�kr�|j|�}|jj||j	|d�S)Nrz)Transform function invalid for data types)rnrz)
r�r�r�r�r�rzrr`r�rn)rKr`r�r�ZindsrN�colrzrLrLrMrc�s
z(DataFrameGroupBy._transform_item_by_itemc
Os�g}|j}|jj||jd�}x�|D]�\}}	tj|	d|�||	f|�|�}
y|
j�}
Wntk
rhYnXt|
�s�t	|
�r�t
|
�r�|
r�t|
�r�|j|j
|��q"tdt|
�j�d���q"W|j||�S)a}
        Return a copy of a DataFrame excluding filtered elements.

        Elements from groups are filtered if they do not satisfy the
        boolean criterion specified by func.

        Parameters
        ----------
        func : function
            Function to apply to each subframe. Should return True or False.
        dropna : Drop groups that do not pass the filter. True by default;
            If False, groups that evaluate False are filled with NaNs.

        Returns
        -------
        filtered : DataFrame

        Notes
        -----
        Each subframe is endowed with the attribute 'name' in case you need to know
        which group you are working on.

        Examples
        --------
        >>> df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
        ...                           'foo', 'bar'],
        ...                    'B' : [1, 2, 3, 4, 5, 6],
        ...                    'C' : [2.0, 5., 8., 1., 2., 9.]})
        >>> grouped = df.groupby('A')
        >>> grouped.filter(lambda x: x['B'].mean() > 3.)
             A  B    C
        1  bar  2  5.0
        3  bar  4  1.0
        5  bar  6  9.0
        )rurIzfilter function returned a z, but expected a scalar bool)r]r|r`rur�r�r\r3r!r&r(r)r�r�r�r�rRr�)rKrgr�rhrir�r`rfrIr�r�rLrLrMr��s $zDataFrameGroupBy.filtercs2t|t�r&t|�dkr&tjdtdd�t�j|�S)NrmzmIndexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.rZ)�
stacklevel)rwr�r��warnings�warn�
FutureWarningre�__getitem__)rKrs)rjrLrMrs?szDataFrameGroupBy.__getitem__)rGc	Csp|dkr6|dkr|j}t||j||j|j|j|jd�S|dkrd|dkrP|j|}t|||j|jd�Std��dS)a
        sub-classes to define
        return a sliced object

        Parameters
        ----------
        key : string / list of selections
        ndim : 1,2
            requested ndim of result
        subset : object, default None
            subset to act on
        rZN)rTr|r8r�r�rm)rTr|r�zinvalid ndim for _gotitem)r`r.r|r8r�r�r[r�)rKrsrGZsubsetrLrLrM�_gotitemLs"


zDataFrameGroupBy._gotitemcCsD|jjd}|jdkr,|jj||j|d�jS|jj||j|d�SdS)Nr)rnrz)r|r�rur`r�rzr�rn)rKr�r`r�rLrLrMrSns

z#DataFrameGroupBy._wrap_frame_outputcCs"|j}|jdkr|jjS|jSdS)Nrm)rRrur�rF)rKr`rLrLrMrAxs
z'DataFrameGroupBy._get_data_to_aggregatecCsdttt|jj|jj�dd�|jjD�f��}|j}x,|D]$\}}}|r8||kr8|jd||�q8WdS)NcSsg|]
}|j�qSrL)�in_axis)rr�grprLrLrMrt�szCDataFrameGroupBy._insert_inaxis_grouper_inplace.<locals>.<listcomp>r)	r��map�reversedr|r�Zget_group_levelsrrz�insert)rKr��iziprzrIrrurLrLrMr5sz/DataFrameGroupBy._insert_inaxis_grouper_inplace)r�rnr\cCs�dd�|j�D�}|jjd|j�j}tdd�|D�|d�}|jj|�}||_|j	sh|j
|�|j�}n
|jj
|_|jdkr�|j}|j|�S)a
        Wraps the output of DataFrameGroupBy aggregations into the expected result.

        Parameters
        ----------
        output : Mapping[base.OutputKey, Union[Series, np.ndarray]]
           Data to wrap.

        Returns
        -------
        DataFrame
        cSsi|]\}}||j�qSrL)r�)rrrsr�rLrLrMr��sz<DataFrameGroupBy._wrap_aggregated_output.<locals>.<dictcomp>rmcSsg|]
}|j�qSrL)rq)rrrsrLrLrMrt�sz<DataFrameGroupBy._wrap_aggregated_output.<locals>.<listcomp>)rI)r�rRr[rurIr:r`r�rzr�r5�_consolidater|r�rnr�r�)rKr�rnr�rIrzr�rLrLrMr��s



z(DataFrameGroupBy._wrap_aggregated_output)r�r\cCsDdd�|j�D�}tdd�|D��}|jj|�}||_|jj|_|S)a
        Wraps the output of DataFrameGroupBy transformations into the expected result.

        Parameters
        ----------
        output : Mapping[base.OutputKey, Union[Series, np.ndarray]]
            Data to wrap.

        Returns
        -------
        DataFrame
        cSsi|]\}}||j�qSrL)r�)rrrsr�rLrLrMr��sz=DataFrameGroupBy._wrap_transformed_output.<locals>.<dictcomp>css|]}|jVqdS)N)rq)rrrsrLrLrMr��sz<DataFrameGroupBy._wrap_transformed_output.<locals>.<genexpr>)r�r:r`r�rzrn)rKr�r�rzr�rLrLrMr��s
z)DataFrameGroupBy._wrap_transformed_outputzSequence[Block])rBr�r\cCs�|jsLtj|djjd�}t|||gd�}|jj|�}|j|�|j	�}n$|j
j}t|||gd�}|jj|�}|jdkr�|j
}|j|�jdd�S)Nrrm)ZaxesT)r/r�)r�r�rr�rEr=r`r�r5r{r|r�rur�r�r6)rKrBr�rnZmgrr�rLrLrMr=�s


z#DataFrameGroupBy._wrap_agged_blocksccsHxBt|jj�D]2\}}|t|jjdd�|f||j|jd�fVqWdS)N)rTr|r8)r�r]rzr[r4r|r8)rKrNZcolnamerLrLrM�_iterate_column_groupbys�sz)DataFrameGroupBy._iterate_column_groupbyscs0ddlm}|�fdd�|j�D�|jjdd�S)Nr)rpc3s|]\}}�|�VqdS)NrL)rrr�Zcol_groupby)rgrLrMr��sz=DataFrameGroupBy._apply_to_column_groupbys.<locals>.<genexpr>rm)rvru)r�rpr|r]rz)rKrgrprL)rgrMr�s
z*DataFrameGroupBy._apply_to_column_groupbyscsz|j�}|jj\�}��d	k��fdd�|jD�}dd�|jD�}��fdd�|D�}dd�t||�D�}|j||jd�S)
z�
        Compute count of group, excluding missing values.

        Returns
        -------
        DataFrame
            Count of values within each group.
        rmc3s&|]}�t|j�j|j�@VqdS)N)r(r�rHrE)rrrQ)r�rLrMr��sz)DataFrameGroupBy.count.<locals>.<genexpr>css|]}|jVqdS)N)rC)rrrQrLrLrMr�sc3s |]}tj|��dd�VqdS)rm)�labelsZmax_binruN)rZcount_level_2d)rrr�)r�rrLrMr�scSsg|]\}}t||d��qS))Z	placement)r>)rrr�rOrLrLrMrtsz*DataFrameGroupBy.count.<locals>.<listcomp>)r�r�)rAr|r�rBr�r=r�)rKr�r��valsrLZcountedrBrL)r�r�rrMr�s	zDataFrameGroupBy.count)r�cs�ddlm}�j}|j�j�}t|�}|dkr8|j}n|j}|��fdd�|�D�dd�}|dkrj|j}|j	|�j
|j	|�_
�js�tj
t|��|_�j|�|S)a�
        Return DataFrame with counts of unique elements in each position.

        Parameters
        ----------
        dropna : bool, default True
            Don't include NaN in the counts.

        Returns
        -------
        nunique: DataFrame

        Examples
        --------
        >>> df = pd.DataFrame({'id': ['spam', 'egg', 'egg', 'spam',
        ...                           'ham', 'ham'],
        ...                    'value1': [1, 5, 5, 2, 5, 5],
        ...                    'value2': list('abbaxy')})
        >>> df
             id  value1 value2
        0  spam       1      a
        1   egg       5      b
        2   egg       5      b
        3  spam       2      a
        4   ham       5      x
        5   ham       5      y

        >>> df.groupby('id').nunique()
              value1  value2
        id
        egg        1       1
        ham        1       2
        spam       2       1

        Check for rows with the same id but conflicting values:

        >>> df.groupby('id').filter(lambda g: (g.nunique() > 1).any())
             id  value1 value2
        0  spam       1      a
        3  spam       2      a
        4   ham       5      x
        5   ham       5      y
        r)rpcs&g|]\}}t||�jd�j���qS))rTr|)r[r|r�)rrrq�content)r�rKrLrMrtDsz,DataFrameGroupBy.nunique.<locals>.<listcomp>rm)ru)r�rprRZ_get_axis_numberru�intr�Ziterrowsr�r[r�r��ibaseZ
default_indexr�rnr5)rKr�rpr`Zaxis_numberrkZ	iter_funcr�rL)r�rKrMr�	s$,

zDataFrameGroupBy.nunique)Nr�)NTr�r�)NTr�)F)T)N)T):rRr&r'r4�dataframe_apply_allowlistr)rr*rr6r�r,rr?r^rxr�r�r0r@r<r0rVr�r�rrr8r�r�rarrbrcr�rsrtrSr=rAr5r
r�rr�r�rr:r�r�r=r|rrr�rDZboxplotr-rLrL)rjrMr.ZsTG
=
!
B&

@
"
 		Mr.)r�r\csd�j���fdd�tt�j��D�}x:|D]2}t�jdd�|fjdd�}|�jdd�|f<q*W�S)aL
    If the original has datetime-like columns, coerce them back to dates,
    since stacking can easily produce object dtypes here.

    Parameters
    ----------
    result : DataFrame

    Returns
    -------
    DataFrame

    Notes
    -----
    - Assumes Groupby._selected_obj has ndim==2 and at least one
    datetimelike column
    cs g|]}t�jj|�r|�qSrL)r%r]r4)rrr�)r�rLrMrtnsz/_recast_datetimelike_result.<locals>.<listcomp>NF)Zconvert_numeric)r��ranger�rzrr4r�)r�Zobj_cols�nZ	convertedrL)r�rMr^Ys

r^)trQ�collectionsrrr��	functoolsr�textwraprZtypingrrrr	r
rrr
rrrrrrrpZnumpyr�Zpandas._libsrZpandas._typingrrZpandas.util._decoratorsrrrZpandas.core.dtypes.castrrrrrZpandas.core.dtypes.commonrr r!r"r#r$r%r&r'Zpandas.core.dtypes.missingr(r)Zpandas.core.aggregationr*r+r,Zpandas.core.algorithms�corer�Zpandas.core.baser-r.Zpandas.core.common�commonr�Zpandas.core.constructionr/Zpandas.core.framer0Zpandas.core.genericr1r2r3Zpandas.core.groupbyr4Zpandas.core.groupby.groupbyr5r6r7r8r9Zpandas.core.indexes.apir:r;r<Zpandas.core.indexes.base�indexesr�Zpandas.core.internalsr=r>Zpandas.core.seriesr?Zpandas.core.util.numba_r@rArBrCZpandas.plottingrDrErFrxZ	AggScalarZTypeVarrHrTrZr(r[r�r.r^rLrLrLrM�<module>sl@,
$=