Current File : //usr/local/lib64/python3.6/site-packages/pandas/core/__pycache__/frame.cpython-36.pyc
DataFrame
---------
An efficient 2D container for potentially mixed-type time series or other
labeled data series.

Similar to its R counterpart, data.frame, except providing automatic data
alignment and a host of useful data manipulation methods having to do with the
labeling information.

axis : {0 or 'index', 1 or 'columns'}, default 0
        If 0 or 'index': apply function to each column.
        If 1 or 'columns': apply function to each row.

        by : str or list of str
            Name or list of names to sort by.

            - if `axis` is 0 or `'index'` then `by` may contain index
              levels and/or column labels.
            - if `axis` is 1 or `'columns'` then `by` may contain column
              levels and/or index labels.

            .. versionchanged:: 0.23.0

               Allow specifying index or column level names.

labels : array-like, optional
            New labels / index to conform the axis specified by 'axis' to.

axis : int or str, optional
            Axis to target. Can be either the axis name ('index', 'columns')
            or number (0, 1).

numeric_only : boolean, default None
    Include only float, int, boolean data. If None, will attempt to use
    everything, then use only numeric data

Merge DataFrame or named Series objects with a database-style join.

The join is done on columns or indexes. If joining columns on
columns, the DataFrame indexes *will be ignored*. Otherwise if joining indexes
on indexes or indexes on a column or columns, the index will be passed on.

Parameters
----------%s
right : DataFrame or named Series
    Object to merge with.
how : {'left', 'right', 'outer', 'inner'}, default 'inner'
    Type of merge to be performed.

    * left: use only keys from left frame, similar to a SQL left outer join;
      preserve key order.
    * right: use only keys from right frame, similar to a SQL right outer join;
      preserve key order.
    * outer: use union of keys from both frames, similar to a SQL full outer
      join; sort keys lexicographically.
    * inner: use intersection of keys from both frames, similar to a SQL inner
      join; preserve the order of the left keys.
on : label or list
    Column or index level names to join on. These must be found in both
    DataFrames. If `on` is None and not merging on indexes then this defaults
    to the intersection of the columns in both DataFrames.
left_on : label or list, or array-like
    Column or index level names to join on in the left DataFrame. Can also
    be an array or list of arrays of the length of the left DataFrame.
    These arrays are treated as if they are columns.
right_on : label or list, or array-like
    Column or index level names to join on in the right DataFrame. Can also
    be an array or list of arrays of the length of the right DataFrame.
    These arrays are treated as if they are columns.
left_index : bool, default False
    Use the index from the left DataFrame as the join key(s). If it is a
    MultiIndex, the number of keys in the other DataFrame (either the index
    or a number of columns) must match the number of levels.
right_index : bool, default False
    Use the index from the right DataFrame as the join key. Same caveats as
    left_index.
sort : bool, default False
    Sort the join keys lexicographically in the result DataFrame. If False,
    the order of the join keys depends on the join type (how keyword).
suffixes : list-like, default is ("_x", "_y")
    A length-2 sequence where each element is optionally a string
    indicating the suffix to add to overlapping column names in
    `left` and `right` respectively. Pass a value of `None` instead
    of a string to indicate that the column name from `left` or
    `right` should be left as-is, with no suffix. At least one of the
    values must not be None.
copy : bool, default True
    If False, avoid copy if possible.
indicator : bool or str, default False
    If True, adds a column to the output DataFrame called "_merge" with
    information on the source of each row. The column can be given a different
    name by providing a string argument. The column will have a Categorical
    type with the value of "left_only" for observations whose merge key only
    appears in the left DataFrame, "right_only" for observations
    whose merge key only appears in the right DataFrame, and "both"
    if the observation's merge key is found in both DataFrames.

validate : str, optional
    If specified, checks if merge is of specified type.

    * "one_to_one" or "1:1": check if merge keys are unique in both
      left and right datasets.
    * "one_to_many" or "1:m": check if merge keys are unique in left
      dataset.
    * "many_to_one" or "m:1": check if merge keys are unique in right
      dataset.
    * "many_to_many" or "m:m": allowed, but does not result in checks.

Returns
-------
DataFrame
    A DataFrame of the two merged objects.

See Also
--------
merge_ordered : Merge with optional filling/interpolation.
merge_asof : Merge on nearest keys.
DataFrame.join : Similar method using indices.

Notes
-----
Support for specifying index levels as the `on`, `left_on`, and
`right_on` parameters was added in version 0.23.0.
Support for merging named Series objects was added in version 0.24.0.

Examples
--------
>>> df1 = pd.DataFrame({'lkey': ['foo', 'bar', 'baz', 'foo'],
...                     'value': [1, 2, 3, 5]})
>>> df2 = pd.DataFrame({'rkey': ['foo', 'bar', 'baz', 'foo'],
...                     'value': [5, 6, 7, 8]})
>>> df1
    lkey value
0   foo      1
1   bar      2
2   baz      3
3   foo      5
>>> df2
    rkey value
0   foo      5
1   bar      6
2   baz      7
3   foo      8

Merge df1 and df2 on the lkey and rkey columns. The value columns have
the default suffixes, _x and _y, appended.

>>> df1.merge(df2, left_on='lkey', right_on='rkey')
  lkey  value_x rkey  value_y
0  foo        1  foo        5
1  foo        1  foo        8
2  foo        5  foo        5
3  foo        5  foo        8
4  bar        2  bar        6
5  baz        3  baz        7

Merge DataFrames df1 and df2 with specified left and right suffixes
appended to any overlapping columns.

>>> df1.merge(df2, left_on='lkey', right_on='rkey',
...           suffixes=('_left', '_right'))
  lkey  value_left rkey  value_right
0  foo           1  foo            5
1  foo           1  foo            8
2  foo           5  foo            5
3  foo           5  foo            8
4  bar           2  bar            6
5  baz           3  baz            7

Merge DataFrames df1 and df2, but raise an exception if the DataFrames have
any overlapping columns.

>>> df1.merge(df2, left_on='lkey', right_on='rkey', suffixes=(False, False))
Traceback (most recent call last):
...
ValueError: columns overlap but no suffix specified:
    Index(['value'], dtype='object')
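
The `indicator` and `validate` parameters described above have no example
in this docstring. The following is a minimal sketch, assuming pandas is
importable as `pd`; the frames `left`, `right` and `dup_right` and their
column names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical frames; "key", "lval" and "rval" are illustrative names.
left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "rval": [20, 30, 40]})

# indicator=True adds a categorical "_merge" column naming each row's source.
merged = left.merge(right, on="key", how="outer", indicator=True)
sources = dict(zip(merged["key"], merged["_merge"].astype(str)))
print(sources)

# validate="one_to_one" checks key uniqueness on both sides before merging.
dup_right = pd.DataFrame({"key": ["b", "b"], "rval": [20, 21]})
try:
    left.merge(dup_right, on="key", validate="one_to_one")
    validation_error = None
except ValueError as exc:  # pandas raises MergeError, a ValueError subclass
    validation_error = str(exc)
print(validation_error)
```

Catching `ValueError` rather than the more specific merge exception keeps
the sketch independent of where that exception class is exposed.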

DataFrame
    Two-dimensional, size-mutable, potentially heterogeneous tabular data.

    Data structure also contains labeled axes (rows and columns).
    Arithmetic operations align on both row and column labels. Can be
    thought of as a dict-like container for Series objects. The primary
    pandas data structure.

    Parameters
    ----------
    data : ndarray (structured or homogeneous), Iterable, dict, or DataFrame
        Dict can contain Series, arrays, constants, or list-like objects.

        .. versionchanged:: 0.23.0
           If data is a dict, column order follows insertion-order for
           Python 3.6 and later.

        .. versionchanged:: 0.25.0
           If data is a list of dicts, column order follows insertion-order
           for Python 3.6 and later.

    index : Index or array-like
        Index to use for resulting frame. Will default to RangeIndex if
        no indexing information is part of the input data and no index is provided.
    columns : Index or array-like
        Column labels to use for resulting frame. Will default to
        RangeIndex (0, 1, 2, ..., n) if no column labels are provided.
    dtype : dtype, default None
        Data type to force. Only a single dtype is allowed. If None, infer.
    copy : bool, default False
        Copy data from inputs. Only affects DataFrame / 2d ndarray input.

    See Also
    --------
    DataFrame.from_records : Constructor from tuples, also record arrays.
    DataFrame.from_dict : From dicts of Series, arrays, or dicts.
    read_csv : Read a comma-separated values (csv) file into DataFrame.
    read_table : Read general delimited file into DataFrame.
    read_clipboard : Read text from clipboard into DataFrame.

    Examples
    --------
    Constructing DataFrame from a dictionary.

    >>> d = {'col1': [1, 2], 'col2': [3, 4]}
    >>> df = pd.DataFrame(data=d)
    >>> df
       col1  col2
    0     1     3
    1     2     4

    Notice that the inferred dtype is int64.

    >>> df.dtypes
    col1    int64
    col2    int64
    dtype: object

    To enforce a single dtype:

    >>> df = pd.DataFrame(data=d, dtype=np.int8)
    >>> df.dtypes
    col1    int8
    col2    int8
    dtype: object

    Constructing DataFrame from numpy ndarray:

    >>> df2 = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
    ...                    columns=['a', 'b', 'c'])
    >>> df2
       a  b  c
    0  1  2  3
    1  4  5  6
    2  7  8  9
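
    The summary above says that arithmetic aligns on both row and column
    labels. A small sketch of that behaviour, assuming pandas is importable
    as `pd`; the frames `a` and `b` and their labels are hypothetical:

```python
import pandas as pd

# Two hypothetical frames that only partly share row and column labels.
a = pd.DataFrame({"x": [1, 2]}, index=["r1", "r2"])
b = pd.DataFrame({"x": [10, 20], "y": [5, 6]}, index=["r2", "r3"])

# Addition pairs values by label, not by position; labels present in only
# one operand produce NaN in the result.
total = a + b
print(total)

# Iterating a DataFrame yields its column labels, as with a dict of Series.
column_labels = list(total)
```

    Only the cell at row `r2`, column `x` exists in both inputs, so it is
    the only non-missing value in `total`.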

DataFrame.axes
        Return a list representing the axes of the DataFrame.

        It has the row axis labels and column axis labels as the only members.
        They are returned in that order.

        Examples
        --------
        >>> df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
        >>> df.axes
        [RangeIndex(start=0, stop=2, step=1), Index(['col1', 'col2'],
        dtype='object')]

DataFrame.shape
        Return a tuple representing the dimensionality of the DataFrame.

        See Also
        --------
        ndarray.shape

        Examples
        --------
        >>> df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
        >>> df.shape
        (2, 2)

        >>> df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4],
        ...                    'col3': [5, 6]})
        >>> df.shape
        (2, 3)

DataFrame._is_homogeneous_type
        Whether all the columns in a DataFrame have the same type.

        Returns
        -------
        bool

        See Also
        --------
        Index._is_homogeneous_type : Whether the object has a single
            dtype.
        MultiIndex._is_homogeneous_type : Whether all the levels of a
            MultiIndex have the same dtype.

        Examples
        --------
        >>> DataFrame({"A": [1, 2], "B": [3, 4]})._is_homogeneous_type
        True
        >>> DataFrame({"A": [1, 2], "B": [3.0, 4.0]})._is_homogeneous_type
        False

        Items with the same type but different sizes are considered
        different types.

        >>> DataFrame({
        ...    "A": np.array([1, 2], dtype=np.int32),
        ...    "B": np.array([1, 2], dtype=np.int64)})._is_homogeneous_type
        False

DataFrame._can_fast_transpose
        Can we transpose this DataFrame without creating any new array objects.

DataFrame._repr_fits_vertical_
        Check length against max_rows.

DataFrame._repr_fits_horizontal_
        Check if full repr fits in horizontal boundaries imposed by the display
        options width and max_columns.

        In case of non-interactive session, no boundaries apply.

        `ignore_width` is here so ipynb+HTML output can behave the way
        users expect. display.max_columns remains in effect.
        GH3541, GH3573

DataFrame._info_repr
        True if the repr should show the info view.

DataFrame.__repr__
        Return a string representation for a particular DataFrame.

DataFrame._repr_html_
        Return a html representation for a particular DataFrame.

        Mainly for IPython notebook.

DataFrame.to_string
        Render a DataFrame to a console-friendly tabular output.
        %(shared_params)s
        line_width : int, optional
            Width to wrap a line in characters.
        max_colwidth : int, optional
            Max width to truncate each column in characters. By default, no limit.

            .. versionadded:: 1.0.0
        encoding : str, default "utf-8"
            Set character encoding.

            .. versionadded:: 1.0
        %(returns)s
        See Also
        --------
        to_html : Convert DataFrame to HTML.

        Examples
        --------
        >>> d = {'col1': [1, 2, 3], 'col2': [4, 5, 6]}
        >>> df = pd.DataFrame(d)
        >>> print(df.to_string())
           col1  col2
        0     1     4
        1     2     5
        2     3     6
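
        The `max_colwidth` parameter documented above is not exercised by the
        example. A brief sketch, assuming pandas 1.0 or later is importable as
        `pd`; the column name `note` and its contents are hypothetical:

```python
import pandas as pd

# Hypothetical frame with one long and one short string value.
df = pd.DataFrame({"note": ["a fairly long piece of text", "short"]})

# By default to_string applies no per-column width limit.
full = df.to_string()

# max_colwidth truncates each cell to at most that many characters,
# marking the cut with an ellipsis.
clipped = df.to_string(max_colwidth=10)

print(full)
print(clipped)
```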

DataFrame.style
        Returns a Styler object.

        Contains methods for building a styled HTML representation of the DataFrame.

        See Also
        --------
        io.formats.style.Styler : Helps style a DataFrame or Series according to the
            data with HTML and CSS.

DataFrame.items
        Iterate over (column name, Series) pairs.

        Iterates over the DataFrame columns, returning a tuple with
        the column name and the content as a Series.

        Yields
        ------
        label : object
            The column names for the DataFrame being iterated over.
        content : Series
            The column entries belonging to each label, as a Series.

        See Also
        --------
        DataFrame.iterrows : Iterate over DataFrame rows as
            (index, Series) pairs.
        DataFrame.itertuples : Iterate over DataFrame rows as namedtuples
            of the values.

        Examples
        --------
        >>> df = pd.DataFrame({'species': ['bear', 'bear', 'marsupial'],
        ...                   'population': [1864, 22000, 80000]},
        ...                   index=['panda', 'polar', 'koala'])
        >>> df
                 species  population
        panda       bear        1864
        polar       bear       22000
        koala  marsupial       80000
        >>> for label, content in df.items():
        ...     print(f'label: {label}')
        ...     print(f'content: {content}', sep='\n')
        ...
        label: species
        content:
        panda         bear
        polar         bear
        koala    marsupial
        Name: species, dtype: object
        label: population
        content:
        panda     1864
        polar    22000
        koala    80000
        Name: population, dtype: int64

DataFrame.iterrows
        Iterate over DataFrame rows as (index, Series) pairs.

        Yields
        ------
        index : label or tuple of label
            The index of the row. A tuple for a `MultiIndex`.
        data : Series
            The data of the row as a Series.

        it : generator
            A generator that iterates over the rows of the frame.

        See Also
        --------
        DataFrame.itertuples : Iterate over DataFrame rows as namedtuples of the values.
        DataFrame.items : Iterate over (column name, Series) pairs.

        Notes
        -----
        1. Because ``iterrows`` returns a Series for each row,
           it does **not** preserve dtypes across the rows (dtypes are
           preserved across columns for DataFrames). For example,

           >>> df = pd.DataFrame([[1, 1.5]], columns=['int', 'float'])
           >>> row = next(df.iterrows())[1]
           >>> row
           int      1.0
           float    1.5
           Name: 0, dtype: float64
           >>> print(row['int'].dtype)
           float64
           >>> print(df['int'].dtype)
           int64

           To preserve dtypes while iterating over the rows, it is better
           to use :meth:`itertuples` which returns namedtuples of the values
           and which is generally faster than ``iterrows``.

        2. You should **never modify** something you are iterating over.
           This is not guaranteed to work in all cases. Depending on the
           data types, the iterator returns a copy and not a view, and writing
           to it will have no effect.
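
        Both notes can be checked side by side. The sketch below assumes
        pandas is importable as `pd`; the frame and its column names `n` and
        `x` are hypothetical. It shows the upcast that `iterrows` performs,
        the per-column types kept by `itertuples`, and a write to a row Series
        that leaves the frame untouched.

```python
import pandas as pd

# Hypothetical frame with one integer and one float column.
df = pd.DataFrame({"n": [1, 2], "x": [0.5, 1.5]})

# iterrows builds one Series per row, so the int column is upcast to float.
_, first_row = next(df.iterrows())

# itertuples keeps each column's own type.
first_tuple = next(df.itertuples(index=False))

# Writing to the row Series does not change df here: with mixed dtypes
# every row is materialised as a copy.
for _, row in df.iterrows():
    row["n"] = 99
after_loop = df["n"].tolist()

# Label-based assignment on the frame itself is the reliable way to update.
df.loc[df["x"] > 1.0, "n"] = 0
after_loc = df["n"].tolist()
```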

DataFrame.itertuples
        Iterate over DataFrame rows as namedtuples.

        Parameters
        ----------
        index : bool, default True
            If True, return the index as the first element of the tuple.
        name : str or None, default "Pandas"
            The name of the returned namedtuples or None to return regular
            tuples.

        Returns
        -------
        iterator
            An object to iterate over namedtuples for each row in the
            DataFrame with the first field possibly being the index and
            following fields being the column values.

        See Also
        --------
        DataFrame.iterrows : Iterate over DataFrame rows as (index, Series)
            pairs.
        DataFrame.items : Iterate over (column name, Series) pairs.

        Notes
        -----
        The column names will be renamed to positional names if they are
        invalid Python identifiers, repeated, or start with an underscore.
        On Python versions < 3.7 regular tuples are returned for DataFrames
        with a large number of columns (>254).

        Examples
        --------
        >>> df = pd.DataFrame({'num_legs': [4, 2], 'num_wings': [0, 2]},
        ...                   index=['dog', 'hawk'])
        >>> df
              num_legs  num_wings
        dog          4          0
        hawk         2          2
        >>> for row in df.itertuples():
        ...     print(row)
        ...
        Pandas(Index='dog', num_legs=4, num_wings=0)
        Pandas(Index='hawk', num_legs=2, num_wings=2)

        By setting the `index` parameter to False we can remove the index
        as the first element of the tuple:

        >>> for row in df.itertuples(index=False):
        ...     print(row)
        ...
        Pandas(num_legs=4, num_wings=0)
        Pandas(num_legs=2, num_wings=2)

        With the `name` parameter set we set a custom name for the yielded
        namedtuples:

        >>> for row in df.itertuples(name='Animal'):
        ...     print(row)
        ...
        Animal(Index='dog', num_legs=4, num_wings=0)
        Animal(Index='hawk', num_legs=2, num_wings=2)
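
        The renaming rule in the Notes follows the standard library's
        ``collections.namedtuple(..., rename=True)`` behaviour. A sketch
        using only the standard library; the column names in `columns` are
        hypothetical and picked to trigger each renaming case:

```python
from collections import namedtuple

# Hypothetical column labels: one valid, one starting with an underscore,
# one Python keyword, and one duplicate.
columns = ["num_legs", "_private", "class", "num_legs"]

# rename=True replaces every invalid or repeated field name with an
# underscore followed by its position in the field list.
Row = namedtuple("Pandas", ["Index"] + columns, rename=True)
print(Row._fields)  # ('Index', 'num_legs', '_2', '_3', '_4')
```

        Fields renamed this way can still be read by position, for example
        ``row[2]``.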

DataFrame.__len__
        Returns length of info axis, but here we use the index.

DataFrame.dot
        Compute the matrix multiplication between the DataFrame and other.

        This method computes the matrix product between the DataFrame and the
        values of another Series, DataFrame or a numpy array.

        It can also be called using ``self @ other`` in Python >= 3.5.

        Parameters
        ----------
        other : Series, DataFrame or array-like
            The other object to compute the matrix product with.

        Returns
        -------
        Series or DataFrame
            If other is a Series, return the matrix product between self and
            other as a Series. If other is a DataFrame or a numpy.array, return
            the matrix product of self and other as a DataFrame (or as a Series
            when other is a one-dimensional array).

        See Also
        --------
        Series.dot: Similar method for Series.

        Notes
        -----
        The dimensions of DataFrame and other must be compatible in order to
        compute the matrix multiplication. In addition, the column names of
        DataFrame and the index of other must contain the same values, as they
        will be aligned prior to the multiplication.

        The dot method for Series computes the inner product, instead of the
        matrix product here.

        Examples
        --------
        Here we multiply a DataFrame with a Series.

        >>> df = pd.DataFrame([[0, 1, -2, -1], [1, 1, 1, 1]])
        >>> s = pd.Series([1, 1, 2, 1])
        >>> df.dot(s)
        0    -4
        1     5
        dtype: int64

        Here we multiply a DataFrame with another DataFrame.

        >>> other = pd.DataFrame([[0, 1], [1, 2], [-1, -1], [2, 0]])
        >>> df.dot(other)
            0   1
        0   1   4
        1   2   2

        Note that the dot method gives the same result as @

        >>> df @ other
            0   1
        0   1   4
        1   2   2

        The dot method also works if other is a np.array.

        >>> arr = np.array([[0, 1], [1, 2], [-1, -1], [2, 0]])
        >>> df.dot(arr)
            0   1
        0   1   4
        1   2   2

        Note how shuffling of the objects does not change the result.

        >>> s2 = s.reindex([1, 0, 2, 3])
        >>> df.dot(s2)
        0    -4
        1     5
        dtype: int64
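
        The Notes state that labels are aligned before multiplying. A sketch
        of the three cases, assuming pandas and NumPy are importable as `pd`
        and `np`; the labels `a`, `b` and `c` are hypothetical:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])

# A Series is matched to the columns by label, so its order is irrelevant:
# each row computes a * 1 + b * 10.
by_label = df.dot(pd.Series([10, 1], index=["b", "a"]))

# A plain array carries no labels and is multiplied positionally:
# each row computes a * 10 + b * 1.
by_position = df.dot(np.array([10, 1]))

# Labels that do not match the columns are rejected instead of producing NaN.
try:
    df.dot(pd.Series([10, 1], index=["b", "c"]))
    alignment_error = None
except ValueError as exc:
    alignment_error = str(exc)
```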

DataFrame.__matmul__
        Matrix multiplication using binary `@` operator in Python>=3.5.

DataFrame.__rmatmul__
        Matrix multiplication using binary `@` operator in Python>=3.5.

DataFrame.from_dict
        Construct DataFrame from dict of array-like or dicts.

        Creates DataFrame object from dictionary by columns or by index
        allowing dtype specification.

        Parameters
        ----------
        data : dict
            Of the form {field : array-like} or {field : dict}.
        orient : {'columns', 'index'}, default 'columns'
            The "orientation" of the data. If the keys of the passed dict
            should be the columns of the resulting DataFrame, pass 'columns'
            (default). Otherwise if the keys should be rows, pass 'index'.
        dtype : dtype, default None
            Data type to force, otherwise infer.
        columns : list, default None
            Column labels to use when ``orient='index'``. Raises a ValueError
            if used with ``orient='columns'``.

            .. versionadded:: 0.23.0

        Returns
        -------
        DataFrame

        See Also
        --------
        DataFrame.from_records : DataFrame from structured ndarray, sequence
            of tuples or dicts, or DataFrame.
        DataFrame : DataFrame object creation using constructor.

        Examples
        --------
        By default the keys of the dict become the DataFrame columns:

        >>> data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
        >>> pd.DataFrame.from_dict(data)
           col_1 col_2
        0      3     a
        1      2     b
        2      1     c
        3      0     d

        Specify ``orient='index'`` to create the DataFrame using dictionary
        keys as rows:

        >>> data = {'row_1': [3, 2, 1, 0], 'row_2': ['a', 'b', 'c', 'd']}
        >>> pd.DataFrame.from_dict(data, orient='index')
               0  1  2  3
        row_1  3  2  1  0
        row_2  a  b  c  d

        When using the 'index' orientation, the column names can be
        specified manually:

        >>> pd.DataFrame.from_dict(data, orient='index',
        ...                        columns=['A', 'B', 'C', 'D'])
               A  B  C  D
        row_1  3  2  1  0
        row_2  a  b  c  d
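
As a small sketch of the ``columns`` restriction listed under Parameters, the snippet below builds the frame with ``orient='index'`` and then shows that the same ``columns`` argument is rejected with ``orient='columns'``. The variable names are illustrative only.

```python
import pandas as pd

data = {"row_1": [3, 2, 1, 0], "row_2": ["a", "b", "c", "d"]}
labels = ["A", "B", "C", "D"]

# Keys become row labels; `columns` names the positions inside each list.
by_index = pd.DataFrame.from_dict(data, orient="index", columns=labels)

# With orient='columns' the keys already are the column labels,
# so passing `columns` as well raises a ValueError.
try:
    pd.DataFrame.from_dict(data, orient="columns", columns=labels)
    rejected = False
except ValueError:
    rejected = True
```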
        Nr�rr�z2cannot use columns parameter with orient='columns'z*only recognize index or columns for orient)r�r�r�)
�lowerr�r�r�r�rzr��_from_nested_dict�keysr�)�clsr��orientr�r�r�r�r�r��	from_dict�s@

zDataFrame.from_dict)r�r�cCs<|j�|jj|j|||d�}|j|k	r8tj||dd�}|S)ax
        Convert the DataFrame to a NumPy array.

        .. versionadded:: 0.24.0

        By default, the dtype of the returned array will be the common NumPy
        dtype of all types in the DataFrame. For example, if the dtypes are
        ``float16`` and ``float32``, the resulting dtype will be ``float32``.
        This may require copying data and coercing values, which may be
        expensive.

        Parameters
        ----------
        dtype : str or numpy.dtype, optional
            The dtype to pass to :func:`numpy.asarray`.
        copy : bool, default False
            Whether to ensure that the returned value is not a view on
            another array. Note that ``copy=False`` does not *ensure* that
            ``to_numpy()`` is no-copy. Rather, ``copy=True`` ensures that
            a copy is made, even if not strictly necessary.
        na_value : Any, optional
            The value to use for missing values. The default value depends
            on `dtype` and the dtypes of the DataFrame columns.

            .. versionadded:: 1.1.0

        Returns
        -------
        numpy.ndarray

        See Also
        --------
        Series.to_numpy : Similar method for Series.

        Examples
        --------
        >>> pd.DataFrame({"A": [1, 2], "B": [3, 4]}).to_numpy()
        array([[1, 3],
               [2, 4]])

        With heterogeneous data, the lowest common type will have to
        be used.

        >>> df = pd.DataFrame({"A": [1, 2], "B": [3.0, 4.5]})
        >>> df.to_numpy()
        array([[1. , 3. ],
               [2. , 4.5]])

        For a mix of numeric and non-numeric types, the output array will
        have object dtype.

        >>> df['C'] = pd.date_range('2000', periods=2)
        >>> df.to_numpy()
        array([[1, 3.0, Timestamp('2000-01-01 00:00:00')],
               [2, 4.5, Timestamp('2000-01-02 00:00:00')]], dtype=object)
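
The three parameters can be exercised together in a short sketch. It assumes a pandas version that supports ``na_value`` (1.1.0 or later, per the note above); the fill value ``-1.0`` and the variable names are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3.0, 4.5]})

# Force a specific dtype instead of the inferred common dtype.
as_float32 = df.to_numpy(dtype="float32")

# copy=True guarantees an independent array, so writing to it
# leaves the DataFrame untouched.
copied = df.to_numpy(copy=True)
copied[0, 0] = 99.0
original_value = df.loc[0, "A"]

# Replace missing values while converting.
with_missing = pd.DataFrame({"A": [1.0, np.nan]}).to_numpy(na_value=-1.0)
```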
        )r'r�r��na_valueF)r�r�)�_consolidate_inplacer�Zas_array�_AXIS_REVERSEDr�r�r�)r�r�r�r/r$r�r�r��to_numpys;
zDataFrame.to_numpyr�c	s��jjstjdtdd�tj���|j�}|jd#�r�|d$kr�tjdt	�|jd�rXd	}nN|jd�rhd
}n>|jd�rxd}n.|jd�r�d}n|jd�r�d
}n|jd�r�d}|d	krȈ�fdd��j
�D��S|d
k�r�dd��j
�D��S|dk�r,�d�jj�fd�jj�fddd��j
ddd�D�ff�S|dk�rL�dd��j
�D��S|d
k�r��jj���fdd��j
ddd�D�}�fdd�|D�S|dk�rȈjj�s�td����fdd��j
dd �D��Std!|�d"���dS)%ai
        Convert the DataFrame to a dictionary.

        The type of the key-value pairs can be customized with the parameters
        (see below).

        Parameters
        ----------
        orient : str {'dict', 'list', 'series', 'split', 'records', 'index'}
            Determines the type of the values of the dictionary.

            - 'dict' (default) : dict like {column -> {index -> value}}
            - 'list' : dict like {column -> [values]}
            - 'series' : dict like {column -> Series(values)}
            - 'split' : dict like
              {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}
            - 'records' : list like
              [{column -> value}, ... , {column -> value}]
            - 'index' : dict like {index -> {column -> value}}

            Abbreviations are allowed but deprecated. `s` indicates `series`
            and `sp` indicates `split`.

        into : class, default dict
            The collections.abc.Mapping subclass used for all Mappings
            in the return value.  Can be the actual class or an empty
            instance of the mapping type you want.  If you want a
            collections.defaultdict, you must pass it initialized.

        Returns
        -------
        dict, list or collections.abc.Mapping
            Return a collections.abc.Mapping object representing the DataFrame.
            The resulting transformation depends on the `orient` parameter.

        See Also
        --------
        DataFrame.from_dict: Create a DataFrame from a dictionary.
        DataFrame.to_json: Convert a DataFrame to JSON format.

        Examples
        --------
        >>> df = pd.DataFrame({'col1': [1, 2],
        ...                    'col2': [0.5, 0.75]},
        ...                   index=['row1', 'row2'])
        >>> df
              col1  col2
        row1     1  0.50
        row2     2  0.75
        >>> df.to_dict()
        {'col1': {'row1': 1, 'row2': 2}, 'col2': {'row1': 0.5, 'row2': 0.75}}

        You can specify the return orientation.

        >>> df.to_dict('series')
        {'col1': row1    1
                 row2    2
        Name: col1, dtype: int64,
        'col2': row1    0.50
                row2    0.75
        Name: col2, dtype: float64}

        >>> df.to_dict('split')
        {'index': ['row1', 'row2'], 'columns': ['col1', 'col2'],
         'data': [[1, 0.5], [2, 0.75]]}

        >>> df.to_dict('records')
        [{'col1': 1, 'col2': 0.5}, {'col1': 2, 'col2': 0.75}]

        >>> df.to_dict('index')
        {'row1': {'col1': 1, 'col2': 0.5}, 'row2': {'col1': 2, 'col2': 0.75}}

        You can also specify the mapping type.

        >>> from collections import OrderedDict, defaultdict
        >>> df.to_dict(into=OrderedDict)
        OrderedDict([('col1', OrderedDict([('row1', 1), ('row2', 2)])),
                     ('col2', OrderedDict([('row1', 0.5), ('row2', 0.75)]))])

        If you want a `defaultdict`, you need to initialize it:

        >>> dd = defaultdict(list)
        >>> df.to_dict('records', into=dd)
        [defaultdict(<class 'list'>, {'col1': 1, 'col2': 0.5}),
         defaultdict(<class 'list'>, {'col1': 2, 'col2': 0.75})]
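
One way to compare the orientations is to check what is needed to rebuild the frame from each. The sketch below does this for ``'split'`` and ``'records'``; the names ``rebuilt`` and ``from_records`` are illustrative, and the observation about lost row labels follows from the ``'records'`` layout listed under Parameters.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [0.5, 0.75]},
                  index=["row1", "row2"])

# 'split' keeps index, columns and data apart, so the frame can be rebuilt.
split = df.to_dict(orient="split")
rebuilt = pd.DataFrame(split["data"], index=split["index"],
                       columns=split["columns"])

# 'records' yields one dict per row and does not carry the row labels.
records = df.to_dict(orient="records")
from_records = pd.DataFrame(records)
```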
        z?DataFrame columns are not unique, some columns will be omitted.r�)�
stacklevelr�r�r�rrr�r��seriesr��recordsr�z�Using short name for 'orient' is deprecated. Only the options: ('dict', list, 'series', 'split', 'records', 'index') will be used in a future version. Use one of the above to silence this warning.�spc3s |]\}}||j��fVqdS)N)�to_dict)r�r�r)�intor�r�r��sz$DataFrame.to_dict.<locals>.<genexpr>css|]\}}||j�fVqdS)N)�tolist)r�r�rr�r�r�r��sr�r�cSsg|]}tttj|���qSr�)r�r�com�maybe_box_datetimelike)r��tr�r�r�r��sz%DataFrame.to_dict.<locals>.<listcomp>FN)r�r�css |]\}}|tj|�fVqdS)N)r;r<)r�r�rr�r�r�r��sc3s|]}tt�|��VqdS)N)r�r
)r��row)r�r�r�r��scs"g|]}�dd�|j�D���qS)css |]\}}|tj|�fVqdS)N)r;r<)r�r�rr�r�r�r�sz/DataFrame.to_dict.<locals>.<listcomp>.<genexpr>)r)r�r>)�into_cr�r�r�sz2DataFrame index must be unique for orient='index'.c3s.|]&}|dtt�j|dd���fVqdS)rr�N)r�r
r�)r�r=)r�r�r�r�	s)r�zorient 'z' not understood)r�r�rr4r>r5r�r�r�r�r6)r�r�warnings�warn�UserWarningr;Zstandardize_mappingr)�
startswith�
FutureWarningrr�r:rr�)r�r-r9Zrowsr�)r�r9r?r�r�r8csnW
















zDataFrame.to_dict�failc
Cs0ddlm}|j||||||||||	|
d�dS)aD
        Write a DataFrame to a Google BigQuery table.

        This function requires the `pandas-gbq package
        <https://pandas-gbq.readthedocs.io>`__.

        See the `How to authenticate with Google BigQuery
        <https://pandas-gbq.readthedocs.io/en/latest/howto/authentication.html>`__
        guide for authentication instructions.

        Parameters
        ----------
        destination_table : str
            Name of table to be written, in the form ``dataset.tablename``.
        project_id : str, optional
            Google BigQuery Account project ID. Optional when available from
            the environment.
        chunksize : int, optional
            Number of rows to be inserted in each chunk from the dataframe.
            Set to ``None`` to load the whole dataframe at once.
        reauth : bool, default False
            Force Google BigQuery to re-authenticate the user. This is useful
            if multiple accounts are used.
        if_exists : str, default 'fail'
            Behavior when the destination table exists. Value can be one of:

            ``'fail'``
                If table exists raise pandas_gbq.gbq.TableCreationError.
            ``'replace'``
                If table exists, drop it, recreate it, and insert data.
            ``'append'``
                If table exists, insert data. Create the table if it does
                not exist.
        auth_local_webserver : bool, default False
            Use the `local webserver flow`_ instead of the `console flow`_
            when getting user credentials.

            .. _local webserver flow:
                https://google-auth-oauthlib.readthedocs.io/en/latest/reference/google_auth_oauthlib.flow.html#google_auth_oauthlib.flow.InstalledAppFlow.run_local_server
            .. _console flow:
                https://google-auth-oauthlib.readthedocs.io/en/latest/reference/google_auth_oauthlib.flow.html#google_auth_oauthlib.flow.InstalledAppFlow.run_console

            *New in version 0.2.0 of pandas-gbq*.
        table_schema : list of dicts, optional
            List of BigQuery table fields to which the corresponding DataFrame
            columns conform, e.g. ``[{'name': 'col1', 'type':
            'STRING'},...]``. If schema is not provided, it will be
            generated according to dtypes of DataFrame columns. See
            BigQuery API documentation on available names of a field.

            *New in version 0.3.1 of pandas-gbq*.
        location : str, optional
            Location where the load job should run. See the `BigQuery locations
            documentation
            <https://cloud.google.com/bigquery/docs/dataset-locations>`__ for a
            list of available locations. The location must match that of the
            target dataset.

            *New in version 0.5.0 of pandas-gbq*.
        progress_bar : bool, default True
            Use the library `tqdm` to show the progress bar for the upload,
            chunk by chunk.

            *New in version 0.5.0 of pandas-gbq*.
        credentials : google.auth.credentials.Credentials, optional
            Credentials for accessing Google APIs. Use this parameter to
            override default credentials, such as to use Compute Engine
            :class:`google.auth.compute_engine.Credentials` or Service
            Account :class:`google.oauth2.service_account.Credentials`
            directly.

            *New in version 0.8.0 of pandas-gbq*.

            .. versionadded:: 0.24.0

        See Also
        --------
        pandas_gbq.to_gbq : This function in the pandas-gbq library.
        read_gbq : Read a DataFrame from Google BigQuery.
        r)�gbq)	�
project_id�	chunksize�reauth�	if_exists�auth_local_webserver�table_schema�location�progress_bar�credentialsN)Z	pandas.iorF�to_gbq)r�Zdestination_tablerGrHrIrJrKrLrMrNrOrFr�r�r�rPs\zDataFrame.to_gbqcs�|dk	rt|�}t��r�|dkr&|�Syt��}Wntk
rN|||d�SXd}t|d�rl|jjrl|j}|g}	|dkr�|	�7}	n|	jtj	�|d��|dk	r�t
j|	|d��n|	�t�t
��r>|dkr�tt���}��fdd�|D��nNg�g�x4�j�D](\}
}|
|k�r�j|
��j|��qWt��|�\��njt�t
jtf��rvt�|�\�}|dk	�rpt|�}|�n2t�||d	�\��t���|dk	�r�t|�}n�}|dk�r�t�}nt|�}d}|dk	�rzt|t��s�t|d
��r,|j|�}
|j|�t��dk�rt�|
|d�}ntg|d�}nNy��fdd�|D�}Wnttfk
�rb|}YnXt||d
�}|j|�t|��r҇fdd�|D�}�fdd�|D���fdd�t ��D���j!|��|j!|�}t"��||�}||�S)a


        Convert structured or record ndarray to DataFrame.

        Creates a DataFrame object from a structured ndarray, sequence of
        tuples or dicts, or DataFrame.

        Parameters
        ----------
        data : structured ndarray, sequence of tuples or dicts, or DataFrame
            Structured input data.
        index : str, list of fields, array-like
            Field of array to use as the index, alternatively a specific set of
            input labels to use.
        exclude : sequence, default None
            Columns or fields to exclude.
        columns : sequence, default None
            Column names to use. If the passed data do not have names
            associated with them, this argument provides names for the
            columns. Otherwise this argument indicates the order of the columns
            in the result (any names not found in the data will become all-NA
            columns).
        coerce_float : bool, default False
            Attempt to convert values of non-string, non-numeric objects (like
            decimal.Decimal) to floating point, useful for SQL result sets.
        nrows : int, default None
            Number of rows to read if data is an iterator.

        Returns
        -------
        DataFrame

        See Also
        --------
        DataFrame.from_dict : DataFrame from dict of array-like or dicts.
        DataFrame : DataFrame object creation using constructor.

        Examples
        --------
        Data can be provided as a structured ndarray:

        >>> data = np.array([(3, 'a'), (2, 'b'), (1, 'c'), (0, 'd')],
        ...                 dtype=[('col_1', 'i4'), ('col_2', 'U1')])
        >>> pd.DataFrame.from_records(data)
           col_1 col_2
        0      3     a
        1      2     b
        2      1     c
        3      0     d

        Data can be provided as a list of dicts:

        >>> data = [{'col_1': 3, 'col_2': 'a'},
        ...         {'col_1': 2, 'col_2': 'b'},
        ...         {'col_1': 1, 'col_2': 'c'},
        ...         {'col_1': 0, 'col_2': 'd'}]
        >>> pd.DataFrame.from_records(data)
           col_1 col_2
        0      3     a
        1      2     b
        2      1     c
        3      0     d

        Data can be provided as a list of tuples with corresponding columns:

        >>> data = [(3, 'a'), (2, 'b'), (1, 'c'), (0, 'd')]
        >>> pd.DataFrame.from_records(data, columns=['col_1', 'col_2'])
           col_1 col_2
        0      3     a
        1      2     b
        2      1     c
        3      0     d
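
The ``index`` and ``exclude`` parameters are not covered by the examples above. A minimal sketch, using made-up field names ``id``, ``label`` and ``score``, might look like this:

```python
import pandas as pd

data = [(3, "a", 0.1), (2, "b", 0.2), (1, "c", 0.3)]

# Use the 'id' field as the index and drop the 'score' field entirely.
df = pd.DataFrame.from_records(
    data,
    columns=["id", "label", "score"],
    index="id",
    exclude=["score"],
)
```

The field named in ``index`` is removed from the columns and becomes the row labels, so only ``label`` remains as a column here.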
        Nr)r�r�r�r�)r�csg|]}�|�qSr�r�)r�r�)r�r�r�r��sz*DataFrame.from_records.<locals>.<listcomp>)�coerce_float�__iter__)r�csg|]}��j|��qSr�)�get_loc)r��field)�arr_columnsr�r�r�r�s)r�csg|]}|�kr|�qSr�r�)r��x)rUr�r�r� scsg|]}�j|��qSr�)rS)r��col)rUr�r�r�!scsg|]\}}|�kr|�qSr�r�)r�rr)�	to_remover�r�r�"s)#rgrL�next�
StopIterationrr�r�r�	itertools�islicer�r�r�r��sortedrrrvr�r�rx�setr�rS�addr�rf�KeyErrorr�rh�updater�r�droprp)r,r�r��excluder�rQZnrowsZ	first_rowr�r�r�r�result_indexrZ
index_dataZarr_excluder�r�)rUr�r�rXr��from_records|s�S














zDataFrame.from_recordscs|r�t�jt�r*tttjt�jj���}n
�jj	g}|�fdd�t
t�j��D�}d}t�jj
�}t�jt�r�xFt|�D]&\}}	|	dkrzd|��||<|d7}qzWn|ddkr�dg}dd�tj|�j�D�}
n0�fd	d�t
t�j��D�}d
d��jD�}
g}t|�}g}x�t|�D]�\}}
|}||k�r<|}||}n||8}|}�j|}t|��r�||k�rp||}n||k�r�||}nd}|dk�r�|j|
j�nPt|ttjtf��r�|j|�n0||k�r�dnd}d
|�d|�d|��}t|���qWtjj||
|d�d�S)a�
        Convert DataFrame to a NumPy record array.

        Index will be included as the first field of the record array if
        requested.

        Parameters
        ----------
        index : bool, default True
            Include index in resulting record array, stored in 'index'
            field or using the index label, if set.
        column_dtypes : str, type, dict, default None
            .. versionadded:: 0.24.0

            If a string or type, the data type to store all columns. If
            a dictionary, a mapping of column names and indices (zero-indexed)
            to specific data types.
        index_dtypes : str, type, dict, default None
            .. versionadded:: 0.24.0

            If a string or type, the data type to store all index levels. If
            a dictionary, a mapping of index level names and indices
            (zero-indexed) to specific data types.

            This mapping is applied only if `index=True`.

        Returns
        -------
        numpy.recarray
            NumPy ndarray with the DataFrame labels as fields and each row
            of the DataFrame as entries.

        See Also
        --------
        DataFrame.from_records: Convert structured or record ndarray
            to DataFrame.
        numpy.recarray: An ndarray that allows field access using
            attributes, analogous to typed columns in a
            spreadsheet.

        Examples
        --------
        >>> df = pd.DataFrame({'A': [1, 2], 'B': [0.5, 0.75]},
        ...                   index=['a', 'b'])
        >>> df
           A     B
        a  1  0.50
        b  2  0.75
        >>> df.to_records()
        rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
                  dtype=[('index', 'O'), ('A', '<i8'), ('B', '<f8')])

        If the DataFrame index has no label then the recarray field name
        is set to 'index'. If the index has a label then this is used as the
        field name:

        >>> df.index = df.index.rename("I")
        >>> df.to_records()
        rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
                  dtype=[('I', 'O'), ('A', '<i8'), ('B', '<f8')])

        The index can be excluded from the record array:

        >>> df.to_records(index=False)
        rec.array([(1, 0.5 ), (2, 0.75)],
                  dtype=[('A', '<i8'), ('B', '<f8')])

        Data types can be specified for the columns:

        >>> df.to_records(column_dtypes={"A": "int32"})
        rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
                  dtype=[('I', 'O'), ('A', '<i4'), ('B', '<f8')])

        As well as for the index:

        >>> df.to_records(index_dtypes="<S2")
        rec.array([(b'a', 1, 0.5 ), (b'b', 2, 0.75)],
                  dtype=[('I', 'S2'), ('A', '<i8'), ('B', '<f8')])

        >>> index_dtypes = f"<S{df.index.str.len().max()}"
        >>> df.to_records(index_dtypes=index_dtypes)
        rec.array([(b'a', 1, 0.5 ), (b'b', 2, 0.75)],
                  dtype=[('I', 'S1'), ('A', '<i8'), ('B', '<f8')])
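
The returned record array exposes each column as a named field. The sketch below inspects the field names and dtypes rather than the printed representation; variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [0.5, 0.75]},
                  index=pd.Index(["a", "b"], name="I"))

# Without the index, only the columns become fields.
rec = df.to_records(index=False, column_dtypes={"A": "int32"})

# With the index, its name is used for the first field.
rec_with_index = df.to_records()

# Fields are reachable as attributes on the recarray.
a_values = rec.A.tolist()
```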
        cs$g|]}tj�jdd�|f��qS)N)r�rr�)r�r)r�r�r�r��sz(DataFrame.to_records.<locals>.<listcomp>rN�level_r�r�cSsg|]}t|��qSr�)r�)r�r�r�r�r�r��scs$g|]}tj�jdd�|f��qS)N)r�rr�)r�r)r�r�r�r��scSsg|]}t|��qSr�)r�)r��cr�r�r�r��sr>�columnzInvalid dtype z specified for � )r��formats)r�)r�r�rjr�rr�r�r
rr�r�r�r�r�rr[�chainrErr�r r�r�ZrecZ
fromarrays)r�r�Z
column_dtypesZindex_dtypesZix_valsr��countr�r�nr�Z	index_lenrjrZ
dtype_mappingr��element�msgr�)r�r��
to_records+sVW









zDataFrame.to_records)r��verify_integrityr�cCs,|dk	rt|�}t||||||d�}||�S)a�
        Create DataFrame from a list of arrays corresponding to the columns.

        Parameters
        ----------
        arrays : list-like of arrays
            Each array in the list corresponds to one column, in order.
        columns : list-like, Index
            The column names for the resulting DataFrame.
        index : list-like, Index
            The row labels for the resulting DataFrame.
        dtype : dtype, optional
            Optional dtype to enforce for all arrays.
        verify_integrity : bool, default True
            Validate and homogenize all input. If set to False, it is assumed
            that all elements of `arrays` are actual arrays as they will be
            stored in a block (numpy ndarray or ExtensionArray), have the same
            length as and are aligned with the index, and that `columns` and
            `index` are ensured to be an Index object.

        Returns
        -------
        DataFrame
        N)r�rq)rSrp)r,r�r�r�r�rqr�r�r�r��_from_arrays�s!zDataFrame._from_arrays�fname�path)Zold_arg_nameZnew_arg_name�r�infer)rt�
convert_dates�write_index�	byteorder�
time_stamp�
data_label�variable_labels�version�convert_strl�compressionr�c
Cs�|dkrtd��|dkr6|	dk	r(td��dd	lm}n"|dkrLdd
lm}nddlm}i}|dksl|dkrt|	|d<|dks�|dkr�||d
<|||f|||||||
d�|��}
|
j�dS)a�
        Export DataFrame object to Stata dta format.

        Writes the DataFrame to a Stata dataset file.
        "dta" files contain a Stata dataset.

        Parameters
        ----------
        path : str, buffer or path object
            String, path object (pathlib.Path or py._path.local.LocalPath) or
            object implementing a binary write() function. If using a buffer
            then the buffer will not be automatically closed after the file
            data has been written.

            .. versionchanged:: 1.0.0

            Previously this was "fname"

        convert_dates : dict
            Dictionary mapping columns containing datetime types to stata
            internal format to use when writing the dates. Options are 'tc',
            'td', 'tm', 'tw', 'th', 'tq', 'ty'. Column can be either an integer
            or a name. Datetime columns that do not have a conversion type
            specified will be converted to 'tc'. Raises NotImplementedError if
            a datetime column has timezone information.
        write_index : bool
            Write the index to Stata dataset.
        byteorder : str
            Can be ">", "<", "little", or "big". Default is `sys.byteorder`.
        time_stamp : datetime
            A datetime to use as file creation date.  Default is the current
            time.
        data_label : str, optional
            A label for the data set.  Must be 80 characters or smaller.
        variable_labels : dict
            Dictionary containing columns as keys and variable labels as
            values. Each label must be 80 characters or smaller.
        version : {114, 117, 118, 119, None}, default 114
            Version to use in the output dta file. Set to None to let pandas
            decide between 118 or 119 formats depending on the number of
            columns in the frame. Version 114 can be read by Stata 10 and
            later. Version 117 can be read by Stata 13 or later. Version 118
            is supported in Stata 14 and later. Version 119 is supported in
            Stata 15 and later. Version 114 limits string variables to 244
            characters or fewer while versions 117 and later allow strings
            with lengths up to 2,000,000 characters. Versions 118 and 119
            support Unicode characters, and version 119 supports more than
            32,767 variables.

            .. versionadded:: 0.23.0
            .. versionchanged:: 1.0.0

                Added support for formats 118 and 119.

        convert_strl : list, optional
            List of names of string columns to convert to Stata StrL
            format. Only available if version is 117.  Storing strings in the
            StrL format can produce smaller dta files if strings have more than
            8 characters and values are repeated.

            .. versionadded:: 0.23.0

        compression : str or dict, default 'infer'
            For on-the-fly compression of the output dta. If string, specifies
            compression mode. If dict, value at key 'method' specifies
            compression mode. Compression mode must be one of {'infer', 'gzip',
            'bz2', 'zip', 'xz', None}. If compression mode is 'infer' and
            `path` is path-like, then detect compression from the following
            extensions: '.gz', '.bz2', '.zip', or '.xz' (otherwise no
            compression). If dict and compression mode is one of {'zip',
            'gzip', 'bz2'}, or inferred as one of the above, other entries
            passed as additional compression options.

            .. versionadded:: 1.1.0

        Raises
        ------
        NotImplementedError
            * If datetimes contain timezone information
            * Column dtype is not representable in Stata
        ValueError
            * Columns listed in convert_dates are neither datetime64[ns]
              or datetime.datetime
            * Column listed in convert_dates is not in DataFrame
            * Categorical label contains more than 32,000 characters

        See Also
        --------
        read_stata : Import Stata data files.
        io.stata.StataWriter : Low-level writer for Stata data files.
        io.stata.StataWriter117 : Low-level writer for version 117 files.

        Examples
        --------
        >>> df = pd.DataFrame({'animal': ['falcon', 'parrot', 'falcon',
        ...                               'parrot'],
        ...                    'speed': [350, 18, 361, 15]})
        >>> df.to_stata('animals.dta')  # doctest: +SKIP
        ru�u�v�wNz1Only formats 114, 117, 118 and 119 are supported.z#strl is not supported in format 114r)�StataWriter)�StataWriter117)�StataWriterUTF8r~r})rwryrzr{rxr|r)rur�r�r�N)r�Zpandas.io.statar�r�r��
write_file)r�rtrwrxryrzr{r|r}r~rZstatawriterr��writerr�r�r��to_statas4qzDataFrame.to_statacKsddlm}|||f|�dS)a�
        Write a DataFrame to the binary Feather format.

        Parameters
        ----------
        path : str
            String file path.
        **kwargs :
            Additional keywords passed to :func:`pyarrow.feather.write_feather`.
            Starting with pyarrow 0.17, this includes the `compression`,
            `compression_level`, `chunksize` and `version` keywords.

            .. versionadded:: 1.1.0
        r)�
to_featherN)Zpandas.io.feather_formatr�)r�rtr�r�r�r�r�r��szDataFrame.to_featherr�a�Examples
        --------
        >>> df = pd.DataFrame(
        ...     data={"animal_1": ["elk", "pig"], "animal_2": ["dog", "quetzal"]}
        ... )
        >>> print(df.to_markdown())
        |    | animal_1   | animal_2   |
        |---:|:-----------|:-----------|
        |  0 | elk        | dog        |
        |  1 | pig        | quetzal    |

        Output markdown with a tabulate option.

        >>> print(df.to_markdown(tablefmt="grid"))
        +----+------------+------------+
        |    | animal_1   | animal_2   |
        +====+============+============+
        |  0 | elk        | dog        |
        +----+------------+------------+
        |  1 | pig        | quetzal    |
        +----+------------+------------+
        )r��examples)r��moder�r�cKs�d|krtjdtdd�|jdd�|jdd�|jd|�td	�}|j|f|�}|dkr^|St||d
�\}}}}|dk	s~t�|j|�dS)NZ	showindexzn'showindex' is deprecated. Only 'index' will be used in a future version. Use 'index' to silence this warning.r�)r3�headersr+Ztablefmt�pipe�tabulate)r�)	r@rArD�
setdefaultr(r�r|�AssertionError�
writelines)r�r�r�r�r�r�r$r�r�r�r��to_markdown�s !
zDataFrame.to_markdown�auto�snappy)rt�enginerr��partition_colsr�cKs,ddlm}||||f|||d�|��dS)a

        Write a DataFrame to the binary parquet format.

        This function writes the dataframe as a `parquet file
        <https://parquet.apache.org/>`_. You can choose different parquet
        backends, and have the option of compression. See
        :ref:`the user guide <io.parquet>` for more details.

        Parameters
        ----------
        path : str or file-like object
            If a string, it will be used as the root directory path
            when writing a partitioned dataset. By file-like object,
            we refer to objects with a write() method, such as a file handler
            (e.g. via builtin open function) or io.BytesIO. The engine
            fastparquet does not accept file-like objects.

            .. versionchanged:: 1.0.0

            Previously this was "fname"

        engine : {'auto', 'pyarrow', 'fastparquet'}, default 'auto'
            Parquet library to use. If 'auto', then the option
            ``io.parquet.engine`` is used. The default ``io.parquet.engine``
            behavior is to try 'pyarrow', falling back to 'fastparquet' if
            'pyarrow' is unavailable.
        compression : {'snappy', 'gzip', 'brotli', None}, default 'snappy'
            Name of the compression to use. Use ``None`` for no compression.
        index : bool, default None
            If ``True``, include the dataframe's index(es) in the file output.
            If ``False``, they will not be written to the file.
            If ``None``, similar to ``True`` the dataframe's index(es)
            will be saved. However, instead of being saved as values,
            the RangeIndex will be stored as a range in the metadata so it
            doesn't require much space and is faster. Other indexes will
            be included as columns in the file output.

            .. versionadded:: 0.24.0

        partition_cols : list, optional, default None
            Column names by which to partition the dataset.
            Columns are partitioned in the order they are given.
            Must be None if path is not a string.

            .. versionadded:: 0.24.0

        **kwargs
            Additional arguments passed to the parquet library. See
            :ref:`pandas io <io.parquet>` for more details.

        See Also
        --------
        read_parquet : Read a parquet file.
        DataFrame.to_csv : Write a csv file.
        DataFrame.to_sql : Write to a sql table.
        DataFrame.to_hdf : Write to hdf.

        Notes
        -----
        This function requires either the `fastparquet
        <https://pypi.org/project/fastparquet>`_ or `pyarrow
        <https://arrow.apache.org/docs/python/>`_ library.

        Examples
        --------
        >>> df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
        >>> df.to_parquet('df.parquet.gzip',
        ...               compression='gzip')  # doctest: +SKIP
        >>> pd.read_parquet('df.parquet.gzip')  # doctest: +SKIP
           col1  col2
        0     1     3
        1     2     4

        If you want to get a buffer to the parquet content you can use an io.BytesIO
        object, as long as you don't use partition_cols, which creates multiple files.

        >>> import io
        >>> f = io.BytesIO()
        >>> df.to_parquet(f)
        >>> f.seek(0)
        0
        >>> content = f.read()
        r)�
to_parquet)rr�r�N)Zpandas.io.parquetr�)r�rtr�rr�r�r�r�r�r�r�r��s]zDataFrame.to_parquet�boolz,Whether to print column labels, default Truez&str or int, list or dict of int or strz�The minimum width of each column in CSS length units.  An int is assumed to be px units.

            .. versionadded:: 0.25.0
                Ability to use strcCs^|dk	r|tjkrtd��tj|||||||	||
||||||
||||d�}|j|||||d�S)a�
        Render a DataFrame as an HTML table.
        %(shared_params)s
        bold_rows : bool, default True
            Make the row labels bold in the output.
        classes : str or list or tuple, default None
            CSS class(es) to apply to the resulting html table.
        escape : bool, default True
            Convert the characters <, >, and & to HTML-safe sequences.
        notebook : {True, False}, default False
            Whether the generated HTML is for IPython Notebook.
        border : int
            A ``border=border`` attribute is included in the opening
            `<table>` tag. Default ``pd.options.display.html.border``.
        encoding : str, default "utf-8"
            Set character encoding.

            .. versionadded:: 1.0

        table_id : str, optional
            A CSS id is included in the opening `<table>` tag if specified.

            .. versionadded:: 0.23.0

        render_links : bool, default False
            Convert URLs to HTML links.

            .. versionadded:: 0.24.0
        %(returns)s
        See Also
        --------
        to_string : Convert DataFrame to a string.
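
A minimal sketch of how the ``classes``, ``table_id`` and ``escape`` parameters
described above might be combined. The frame contents, the class name
``striped`` and the id ``scores`` are illustrative assumptions, not values
taken from pandas itself.

```python
import pandas as pd

# Hypothetical data; the first name deliberately contains HTML markup.
df = pd.DataFrame({'name': ['<b>Ann</b>', 'Bob'], 'score': [9.5, 8.0]})

# escape=True (the default) converts <, > and & to HTML-safe sequences,
# so the markup in the data is shown literally instead of being rendered.
html = df.to_html(classes='striped', table_id='scores', escape=True)

print(html.splitlines()[0])  # the opening <table ...> tag
```

Passing ``escape=False`` would emit the cell contents unchanged, which is only
appropriate when the data is trusted.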
        Nz#Invalid value for justify parameter)r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�)r��classesr��borderr�)r�Z_VALID_JUSTIFY_PARAMETERSr�r�r�)r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�r�G	s8EzDataFrame.to_htmlz and columnsa?max_cols : int, optional
                When to switch from the verbose to the truncated output. If the
                DataFrame has more than `max_cols` columns, the truncated output
                is used. By default, the setting in
                ``pandas.options.display.max_info_columns`` is used.
            a�

            >>> int_values = [1, 2, 3, 4, 5]
            >>> text_values = ['alpha', 'beta', 'gamma', 'delta', 'epsilon']
            >>> float_values = [0.0, 0.25, 0.5, 0.75, 1.0]
            >>> df = pd.DataFrame({"int_col": int_values, "text_col": text_values,
            ...                   "float_col": float_values})
            >>> df
                int_col text_col  float_col
            0        1    alpha       0.00
            1        2     beta       0.25
            2        3    gamma       0.50
            3        4    delta       0.75
            4        5  epsilon       1.00

            Print information about all columns:

            >>> df.info(verbose=True)
            <class 'pandas.core.frame.DataFrame'>
            RangeIndex: 5 entries, 0 to 4
            Data columns (total 3 columns):
             #   Column     Non-Null Count  Dtype
            ---  ------     --------------  -----
             0   int_col    5 non-null      int64
             1   text_col   5 non-null      object
             2   float_col  5 non-null      float64
            dtypes: float64(1), int64(1), object(1)
            memory usage: 248.0+ bytes

            Print a summary of the column count and dtypes, but no per-column
            information:

            >>> df.info(verbose=False)
            <class 'pandas.core.frame.DataFrame'>
            RangeIndex: 5 entries, 0 to 4
            Columns: 3 entries, int_col to float_col
            dtypes: float64(1), int64(1), object(1)
            memory usage: 248.0+ bytes

            Pipe the output of DataFrame.info to a buffer instead of sys.stdout,
            get the buffer content and write it to a text file:

            >>> import io
            >>> buffer = io.StringIO()
            >>> df.info(buf=buffer)
            >>> s = buffer.getvalue()
            >>> with open("df_info.txt", "w",
            ...           encoding="utf-8") as f:  # doctest: +SKIP
            ...     f.write(s)
            260

            The `memory_usage` parameter enables a deep introspection mode, which is
            especially useful for big DataFrames and for fine-tuning memory usage:

            >>> random_strings_array = np.random.choice(['a', 'b', 'c'], 10 ** 6)
            >>> df = pd.DataFrame({
            ...     'column_1': np.random.choice(['a', 'b', 'c'], 10 ** 6),
            ...     'column_2': np.random.choice(['a', 'b', 'c'], 10 ** 6),
            ...     'column_3': np.random.choice(['a', 'b', 'c'], 10 ** 6)
            ... })
            >>> df.info()
            <class 'pandas.core.frame.DataFrame'>
            RangeIndex: 1000000 entries, 0 to 999999
            Data columns (total 3 columns):
             #   Column    Non-Null Count    Dtype
            ---  ------    --------------    -----
             0   column_1  1000000 non-null  object
             1   column_2  1000000 non-null  object
             2   column_3  1000000 non-null  object
            dtypes: object(3)
            memory usage: 22.9+ MB

            >>> df.info(memory_usage='deep')
            <class 'pandas.core.frame.DataFrame'>
            RangeIndex: 1000000 entries, 0 to 999999
            Data columns (total 3 columns):
             #   Column    Non-Null Count    Dtype
            ---  ------    --------------    -----
             0   column_1  1000000 non-null  object
             1   column_2  1000000 non-null  object
             2   column_3  1000000 non-null  object
            dtypes: object(3)
            memory usage: 188.8 MBz�
            DataFrame.describe: Generate descriptive statistics of DataFrame
                columns.
            DataFrame.memory_usage: Memory usage of DataFrame columns.)r�Ztype_subZmax_cols_subZexamples_sub�see_also_sub)�verboser�r��memory_usage�null_countsr�cCst||||||�j�S)N)rr�)r�r�r�r�r�r�r�r�r�r��	sozDataFrame.infocsJ|j�fdd�|j�D�|jd�}|rF|j|jj�d�dgd�j|�}|S)a
        Return the memory usage of each column in bytes.

        The memory usage can optionally include the contribution of
        the index and elements of `object` dtype.

        This value is displayed in `DataFrame.info` by default. This can be
        suppressed by setting ``pandas.options.display.memory_usage`` to False.

        Parameters
        ----------
        index : bool, default True
            Specifies whether to include the memory usage of the DataFrame's
            index in returned Series. If ``index=True``, the memory usage of
            the index is the first item in the output.
        deep : bool, default False
            If True, introspect the data deeply by interrogating
            `object` dtypes for system-level memory consumption, and include
            it in the returned values.

        Returns
        -------
        Series
            A Series whose index is the original column names and whose values
            are the memory usage of each column in bytes.

        See Also
        --------
        numpy.ndarray.nbytes : Total bytes consumed by the elements of an
            ndarray.
        Series.memory_usage : Bytes consumed by a Series.
        Categorical : Memory-efficient array for string values with
            many repeated values.
        DataFrame.info : Concise summary of a DataFrame.

        Examples
        --------
        >>> dtypes = ['int64', 'float64', 'complex128', 'object', 'bool']
        >>> data = dict([(t, np.ones(shape=5000).astype(t))
        ...              for t in dtypes])
        >>> df = pd.DataFrame(data)
        >>> df.head()
           int64  float64            complex128  object  bool
        0      1      1.0    1.000000+0.000000j       1  True
        1      1      1.0    1.000000+0.000000j       1  True
        2      1      1.0    1.000000+0.000000j       1  True
        3      1      1.0    1.000000+0.000000j       1  True
        4      1      1.0    1.000000+0.000000j       1  True

        >>> df.memory_usage()
        Index           128
        int64         40000
        float64       40000
        complex128    80000
        object        40000
        bool           5000
        dtype: int64

        >>> df.memory_usage(index=False)
        int64         40000
        float64       40000
        complex128    80000
        object        40000
        bool           5000
        dtype: int64

        The memory footprint of `object` dtype columns is ignored by default;
        pass ``deep=True`` to include it:

        >>> df.memory_usage(deep=True)
        Index            128
        int64          40000
        float64        40000
        complex128     80000
        object        160000
        bool            5000
        dtype: int64

        Use a Categorical for efficient storage of an object-dtype column with
        many repeated values.

        >>> df['object'].astype('category').memory_usage(deep=True)
        5216
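
The effect of ``deep`` and of a Categorical can be checked side by side. This
is a rough sketch on made-up data (5000 strings with three distinct values);
the exact byte counts depend on platform and pandas version, so none are
quoted here.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 5000 strings drawn from only three distinct values.
labels = pd.Series(np.tile(['red', 'green', 'blue'], 2000)[:5000]).astype(object)

# Shallow measurement: only the array of object pointers is counted.
shallow = labels.memory_usage(index=False)

# Deep measurement: the Python string objects themselves are counted too.
deep = labels.memory_usage(index=False, deep=True)

# A Categorical stores small integer codes plus one copy of each category.
as_category = labels.astype('category').memory_usage(index=False, deep=True)

print(shallow, deep, as_category)
```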
        csg|]\}}|jd�d��qS)F)r��deep)r�)r�rWrg)r�r�r�r�v
sz*DataFrame.memory_usage.<locals>.<listcomp>)r�)r�rf)r	rr�r�r�r)r�r�r�r$r�)r�r�r�!
sT
zDataFrame.memory_usage)r�cs�tj|t��t|j�}|jrv|rvt|d�rv|d��j��|j}��fdd�|D�}|j	tt
|j|��|jd�}n(|jj
}|r�|j�}|j	||j|jd�}|j|dd�S)a�

        Transpose index and columns.

        Reflect the DataFrame over its main diagonal by writing rows as columns
        and vice-versa. The property :attr:`.T` is an accessor to the method
        :meth:`transpose`.

        Parameters
        ----------
        *args : tuple, optional
            Accepted for compatibility with NumPy.
        copy : bool, default False
            Whether to copy the data after transposing, even for DataFrames
            with a single dtype.

            Note that a copy is always required for mixed dtype DataFrames,
            or for DataFrames with any extension types.

        Returns
        -------
        DataFrame
            The transposed DataFrame.

        See Also
        --------
        numpy.transpose : Permute the dimensions of a given array.

        Notes
        -----
        Transposing a DataFrame with mixed dtypes will result in a homogeneous
        DataFrame with the `object` dtype. In such a case, a copy of the data
        is always made.

        Examples
        --------
        **Square DataFrame with homogeneous dtype**

        >>> d1 = {'col1': [1, 2], 'col2': [3, 4]}
        >>> df1 = pd.DataFrame(data=d1)
        >>> df1
           col1  col2
        0     1     3
        1     2     4

        >>> df1_transposed = df1.T # or df1.transpose()
        >>> df1_transposed
              0  1
        col1  1  2
        col2  3  4

        When the dtype is homogeneous in the original DataFrame, we get a
        transposed DataFrame with the same dtype:

        >>> df1.dtypes
        col1    int64
        col2    int64
        dtype: object
        >>> df1_transposed.dtypes
        0    int64
        1    int64
        dtype: object

        **Non-square DataFrame with mixed dtypes**

        >>> d2 = {'name': ['Alice', 'Bob'],
        ...       'score': [9.5, 8],
        ...       'employed': [False, True],
        ...       'kids': [0, 0]}
        >>> df2 = pd.DataFrame(data=d2)
        >>> df2
            name  score  employed  kids
        0  Alice    9.5     False     0
        1    Bob    8.0      True     0

        >>> df2_transposed = df2.T # or df2.transpose()
        >>> df2_transposed
                      0     1
        name      Alice   Bob
        score       9.5     8
        employed  False  True
        kids          0     0

        When the DataFrame has mixed dtypes, we get a transposed DataFrame with
        the `object` dtype:

        >>> df2.dtypes
        name         object
        score       float64
        employed       bool
        kids          int64
        dtype: object
        >>> df2_transposed.dtypes
        0    object
        1    object
        dtype: object
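
As a small illustration of the note on mixed dtypes, transposing twice returns
the original values but not the original dtypes. The sketch below uses
made-up data and restores the dtypes explicitly with ``astype``; the variable
names are illustrative only.

```python
import pandas as pd

mixed = pd.DataFrame({'name': ['Alice', 'Bob'], 'score': [9.5, 8.0]})

# Transposing a mixed-dtype frame yields object columns, and transposing
# back does not recover the original float column automatically.
round_trip = mixed.T.T
print(round_trip.dtypes)

# Restore the original dtypes column by column.
restored = round_trip.astype(mixed.dtypes.to_dict())
print(restored.dtypes)
```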
        rcsg|]}�j|�d��qS))r�)Z_from_sequence)r�r>)�arr_typer�r�r�r��
sz'DataFrame.transpose.<locals>.<listcomp>)r�)r�r�r')�method)�nvZvalidate_transposer�r��dtypesr�rGZconstruct_array_typer�r�r
r�r�r&r��__finalize__)r�r�r�r�r��
new_valuesr$r�)r�r�r�r'
sa
zDataFrame.transposecCs|j�S)N)r')r�r�r�r�r&�
szDataFrame.Tr)rr�cCs�|dkrX|jj|�}t|tj�o(|jdk}|j||j|j||j	d�}|j
||d�|S|j|}|jj|�}|j||�}|j
||�|SdS)z�
        Parameters
        ----------
        i : int
        axis : int

        Notes
        -----
        If slice passed, the resulting data will be a view.
        rN)r�r�r�)r�)r�Zfast_xsr�r�r�rer	r�r�r��_set_is_copy�iget�_box_col_valuesZ_set_as_cached)r�rr�r�r�r$�labelr�r�r�r�rs

zDataFrame._ixs)rr�cCs|jj|�S)zo
        Get the values of the i'th column (ndarray or ExtensionArray, as stored
        in the Block)
        )r�Ziget_values)r�rr�r�r��_get_column_array&szDataFrame._get_column_arrayccs(x"tt|j��D]}|j|�VqWdS)z�
        Iterate over the arrays of all columns in order.
        This returns the values as stored in the Block (ndarray or ExtensionArray).
        N)r�r�r�r�)r�rr�r�r��_iter_column_arrays-szDataFrame._iter_column_arrayscCs�tj|�}tj||�}t|�rP|jjrP||jkrP|jjdkrF|j|�S|j	|�St
||�}|dk	r�t|tj
�r�tj|jtjdd�t|��}|j|dd�St|t�r�|j|�Stj|�r�|j|�St|t�p�t|�}|�r|jjdkr�|j|�S|jj|�}t|��r6|g}n(t|��r t|�}|jj|ddd�d}t|dd�t k�rVtj|�d}|j!|dd�}|�r�|j"ddk�r�t|jt#��r�||}|S)	Nr�F)r�r)r�T)r��
raise_missingr�)$rZitem_from_zerodimr;�apply_if_callablerIr�r�nlevels�_getitem_multilevelrrnr�r�r�Zmaybe_indices_to_slice�astypeZintpr�Z_slicer��where�is_bool_indexer�_getitem_bool_array�tuplerMrSrJrLr��loc�_get_listlike_indexerr�r��_take_with_is_copyr�rj)r��key�indexerZ
is_single_keyr�r�r�r��__getitem__5sD










 zDataFrame.__getitem__cCs�t|t�r,|jj|j�r,tjdtdd�n2t|�t|j�kr^tdt|��dt|j��d���t	|j|�}|j
�d}|j|dd�S)	Nz>Boolean Series key will be reindexed to match DataFrame index.�)r3zItem wrong length z instead of r�r)r�)r�rzr��equalsr@rArBr�r�rm�nonzeror�)r�r�r�r�r�r�r�rs
zDataFrame._getitem_bool_arraycCs�|jj|�}t|ttjf�r�|j|}t||�}|jrJ|j|d�}||_n.|j	dd�|f}|j
||j|d�}|j|�}t
|j�dkr�|jd}t|t�r�|d}|dkr�|d}t|t�r�|j||j|d�}|j|�|S|j|dd�SdS)N)r�)r�r�r�rr�)r�r�)r�)r�rSr��slicer�r�rkr�rr�r�r�r�r�r�rzr	r�r)r�r�r��new_columnsZresult_columnsr$r��topr�r�r�r��s.






zDataFrame._getitem_multilevel)�takeablec
Cs�|r|j|dd�}|j|S|j|�}|jj}y|j|�}|j|Stk
rd|jjdkr`�YnX|jj|�}|jj|�}|j	||dd�S)a
        Quickly retrieve single value at passed column and index.

        Parameters
        ----------
        index : row label
        col : column label
        takeable : interpret the index/col as indexers, default False

        Returns
        -------
        scalar
        r�)r�T)r�)
rrrr��_enginerSr`r�r��
_get_value)r�r�rWr�r5r�r�r�r�r�r��s



zDataFrame._get_valuecCs�tj||�}t||�}|dk	r*|j||�St|t�sDt|dd�dkrR|j||�n.t|tt	j
ttf�rt|j
||�n|j||�dS)Nr�r�)r;r�rn�_setitem_slicer�r�r��_setitem_framerzr�r�r�rf�_setitem_array�	_set_item)r�r�r�r�r�r�r��__setitem__�s
zDataFrame.__setitem__)r�cCs|j�|jj||�dS)N)�_check_setitem_copyr��_setitem_with_indexer)r�r�r�r�r�r�r��szDataFrame._setitem_slicecCstj|�rlt|�t|j�kr<tdt|��dt|j��d���t|j|�}|j�d}|j�|jj	||�n�t
|t�r�t|j�t|�kr�td��xjt
||j�D]\}}||||<q�WnD|jj|dd�|jj|ddd	�d}|j�|jj	td�|f|�dS)
NzItem wrong length z instead of �!rz"Columns must be same length as keyr�)r�F)r�r�)r;r�r�r�r�rmr�r�r�r�r�r�r�r
r�Z_ensure_listlike_indexerr�r�)r�r�r�r�Zk1Zk2r�r�r�r��s&


zDataFrame._setitem_arraycCstt|tj�r2|j|jkr td��|j|f|j��}|jrLt|j	�rLt
d��|j|�|j�|j
||dd�dS)Nz,Array conditional must be same shape as selfz;Must pass DataFrame or 2-d ndarray with boolean values onlyT)�inplace)r�r�r�r�r�r�Z_construct_axes_dict�sizerBr�r�Z_check_inplace_settingr�Z_where)r�r�r�r�r�r�r�s
zDataFrame._setitem_frame)r�cCs<|j|�|j||dd�}tj|||�t|�r8|j�dS)NF)�	broadcast)�_ensure_valid_index�_sanitize_columnrc�
_iset_itemr�r�)r�r�r�r�r�r�r�s

zDataFrame._iset_itemcCs8|j|�|j||�}tj|||�t|�r4|j�dS)a8
        Add series to DataFrame in specified column.

        If series is a numpy-array (not a Series/TimeSeries), it must be the
        same length as the DataFrame's index or an error will be thrown.

        Series/TimeSeries will be conformed to the DataFrame's index to
        ensure homogeneity.
        N)r�r�rcr�r�r�)r�r�r�r�r�r�r�%s


zDataFrame._set_itemcCs�y`|dkr,|j|dd�}|j||dd�dS|j|�}|jj}|j|�}t|j|�||j|<WnHt	t
fk
r�|r�||j||f<n||j||f<|j
j|d�YnXdS)z�
        Put single value at passed column and index.

        Parameters
        ----------
        index : row label
        col : column label
        value : scalar
        takeable : interpret the index/col as indexers, default False
        Tr�)r�)r�N)r�
_set_valuerr�r�rSr>r�rr`r�r�r�r�pop)r�r�rWr�r�r5r�r�r�r�r�r�9s

zDataFrame._set_valuecCs�t|j�r�t|�r�t|�r�yt|�}Wn2tttfk
rZ}ztd�|�WYdd}~XnX|jj�}|jjdk	r||jj|_|j	j
|dtjd�|_	dS)zn
        Ensure that if we don't have an index, that we can create one from the
        passed value.
        zYCannot set a frame with no defined index and a value that cannot be converted to a SeriesNr�)r�r�)
r�r�rMrzr�r�r�r�r�r�Zreindex_axisr��nan)r�r�r�Z
index_copyr�r�r�r�Zs

zDataFrame._ensure_valid_index)r�r�cCs"|j|}|j}|||j|dd�S)z4
        Provide boxed values for a column.
        T)r�r��fastpath)r�r	r�)r�r�r�r�r�r�r�r�r�ps
zDataFrame._box_col_valuescKs�t|d�}t|t�s,dt|��d�}t|��|jdd�d|d<d|d<|j|f|�}y|j|}Wntk
r�||}YnX|r�|j|�n|SdS)	a�
        Query the columns of a DataFrame with a boolean expression.

        Parameters
        ----------
        expr : str
            The query string to evaluate.

            You can refer to variables
            in the environment by prefixing them with an '@' character like
            ``@a + b``.

            You can refer to column names that contain spaces or operators by
            surrounding them in backticks. This also lets you escape names
            that start with a digit or that are Python keywords, that is, any
            name that is not a valid Python identifier. See the Notes section
            below for more details.

            For example, if one of your columns is called ``a a`` and you want
            to sum it with ``b``, your query should be ```a a` + b``.

            .. versionadded:: 0.25.0
                Backtick quoting introduced.

            .. versionadded:: 1.0.0
                Expanding functionality of backtick quoting for more than only spaces.

        inplace : bool
            Whether the query should modify the data in place or return
            a modified copy.
        **kwargs
            See the documentation for :func:`eval` for complete details
            on the keyword arguments accepted by :meth:`DataFrame.query`.

        Returns
        -------
        DataFrame
            DataFrame resulting from the provided query expression.

        See Also
        --------
        eval : Evaluate a string describing operations on
            DataFrame columns.
        DataFrame.eval : Evaluate a string describing operations on
            DataFrame columns.

        Notes
        -----
        The result of the evaluation of this expression is first passed to
        :attr:`DataFrame.loc` and if that fails because of a
        multidimensional key (e.g., a DataFrame) then the result will be passed
        to :meth:`DataFrame.__getitem__`.

        This method uses the top-level :func:`eval` function to
        evaluate the passed query.

        The :meth:`~pandas.DataFrame.query` method uses a slightly
        modified Python syntax by default. For example, the ``&`` and ``|``
        (bitwise) operators have the precedence of their boolean cousins,
        :keyword:`and` and :keyword:`or`. This *is* syntactically valid Python,
        however the semantics are different.

        You can change the semantics of the expression by passing the keyword
        argument ``parser='python'``. This enforces the same semantics as
        evaluation in Python space. Likewise, you can pass ``engine='python'``
        to evaluate an expression using Python itself as a backend. This is not
        recommended as it is inefficient compared to using ``numexpr`` as the
        engine.

        The :attr:`DataFrame.index` and
        :attr:`DataFrame.columns` attributes of the
        :class:`~pandas.DataFrame` instance are placed in the query namespace
        by default, which allows you to treat both the index and columns of the
        frame as a column in the frame.
        The identifier ``index`` is used for the frame index; you can also
        use the name of the index to identify it in a query. Please note that
        Python keywords may not be used as identifiers.

        For further details and examples see the ``query`` documentation in
        :ref:`indexing <indexing.query>`.

        *Backtick quoted variables*

        Backtick quoted variables are parsed as literal Python code and
        are converted internally to a valid Python identifier.
        This can lead to the following problems.

        During parsing a number of disallowed characters inside the backtick
        quoted string are replaced by strings that are allowed as a Python identifier.
        These characters include all operators in Python, the space character, the
        question mark, the exclamation mark, the dollar sign, and the euro sign.
        For other characters that fall outside the ASCII range (U+0001..U+007F)
        and those that are not further specified in PEP 3131,
        the query parser will raise an error.
        This excludes whitespace other than the space character,
        but also the hash character (as it is used for comments) and the
        backtick itself (a backtick cannot be escaped either).

        In a special case, quotes that make a pair around a backtick can
        confuse the parser.
        For example, ```it's` > `that's``` will raise an error,
        as it forms a quoted string (``'s > `that'``) with a backtick inside.

        See also the Python documentation about lexical analysis
        (https://docs.python.org/3/reference/lexical_analysis.html)
        in combination with the source code in :mod:`pandas.core.computation.parsing`.

        Examples
        --------
        >>> df = pd.DataFrame({'A': range(1, 6),
        ...                    'B': range(10, 0, -2),
        ...                    'C C': range(10, 5, -1)})
        >>> df
           A   B  C C
        0  1  10   10
        1  2   8    9
        2  3   6    8
        3  4   4    7
        4  5   2    6
        >>> df.query('A > B')
           A  B  C C
        4  5  2    6

        The previous expression is equivalent to

        >>> df[df.A > df.B]
           A  B  C C
        4  5  2    6

        For columns with spaces in their name, you can use backtick quoting.

        >>> df.query('B == `C C`')
           A   B  C C
        0  1  10   10

        The previous expression is equivalent to

        >>> df[df.B == df['C C']]
           A   B  C C
        0  1  10   10
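
Two points from the parameter description and the Notes can be sketched
briefly: referring to a Python variable with ``@``, and the modified
precedence of ``&``. The helper ``rows_above`` and the threshold value are
hypothetical names introduced only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': range(1, 6), 'B': range(10, 0, -2)})


def rows_above(frame, threshold):
    # '@threshold' refers to the local Python variable, not to a column.
    return frame.query('A > @threshold')


above_three = rows_above(df, 3)

# With the default parser, '&' has the precedence of 'and', so the two
# comparisons do not need parentheses inside the query string ...
both = df.query('A > 1 & B < 8')

# ... whereas plain Python indexing requires them.
explicit = df[(df.A > 1) & (df.B < 8)]

print(above_three)
print(both)
```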
        r�z'expr must be a string to be evaluated, z given�levelrr�N�target)	r0r�r�r r�r��evalr��_update_inplace)r��exprr�r�ro�resr$r�r�r��query}s

zDataFrame.querycKs�ddlm}t|d�}|jdd�}|jdd�d|d<|dkrV|j�}|j�}||f}d|krf||d<|jdf�t|�|d<||fd|i|��S)	a�

        Evaluate a string describing operations on DataFrame columns.

        Operates on columns only, not specific rows or elements.  This allows
        `eval` to run arbitrary code, which can make you vulnerable to code
        injection if you pass user input to this function.

        Parameters
        ----------
        expr : str
            The expression string to evaluate.
        inplace : bool, default False
            If the expression contains an assignment, whether to perform the
            operation inplace and mutate the existing DataFrame. Otherwise,
            a new DataFrame is returned.
        **kwargs
            See the documentation for :func:`eval` for complete details
            on the keyword arguments accepted by
            :meth:`~pandas.DataFrame.query`.

        Returns
        -------
        ndarray, scalar, or pandas object
            The result of the evaluation.

        See Also
        --------
        DataFrame.query : Evaluates a boolean expression to query the columns
            of a frame.
        DataFrame.assign : Can evaluate an expression or function to create new
            values for a column.
        eval : Evaluate a Python expression as a string using various
            backends.

        Notes
        -----
        For more details see the API documentation for :func:`~eval`.
        For detailed examples see :ref:`enhancing performance with eval
        <enhancingperf.eval>`.

        Examples
        --------
        >>> df = pd.DataFrame({'A': range(1, 6), 'B': range(10, 0, -2)})
        >>> df
           A   B
        0  1  10
        1  2   8
        2  3   6
        3  4   4
        4  5   2
        >>> df.eval('A + B')
        0    11
        1    10
        2     9
        3     8
        4     7
        dtype: int64

        Assignment is allowed though by default the original DataFrame is not
        modified.

        >>> df.eval('C = A + B')
           A   B   C
        0  1  10  11
        1  2   8  10
        2  3   6   9
        3  4   4   8
        4  5   2   7
        >>> df
           A   B
        0  1  10
        1  2   8
        2  3   6
        3  4   4
        4  5   2

        Use ``inplace=True`` to modify the original DataFrame.

        >>> df.eval('C = A + B', inplace=True)
        >>> df
           A   B   C
        0  1  10  11
        1  2   8  10
        2  3   6   9
        3  4   4   8
        4  5   2   7

        Multiple columns can be assigned to using multi-line expressions:

        >>> df.eval(
        ...     '''
        ... C = A + B
        ... D = A - B
        ... '''
        ... )
           A   B   C  D
        0  1  10  11 -9
        1  2   8  10 -6
        2  3   6   9 -3
        3  4   4   8  0
        4  5   2   7  3
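
An expression without an assignment simply returns its value. As a sketch,
a boolean expression evaluated this way can serve as a row mask, which should
select the same rows as the corresponding ``query`` call. Because the
expression string is evaluated as code, it is hard-coded here rather than
built from user input.

```python
import pandas as pd

df = pd.DataFrame({'A': range(1, 6), 'B': range(10, 0, -2)})

# A boolean expression yields a boolean Series with one entry per row.
mask = df.eval('A + B > 9')

# Using the mask for selection mirrors df.query with the same expression.
selected = df[mask]
via_query = df.query('A + B > 9')

print(selected)
```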
        r)r�r��	resolversNr�r�r�)Zpandas.core.computation.evalr�r0r�Z_get_index_resolversZ_get_cleaned_column_resolvers�getr�)r�r�r�r�Z_evalr�Zindex_resolversZcolumn_resolversr�r�r�r�
sg
zDataFrame.evalc
Cs<t|�s|dk	r|fnf}t|�s4|dk	r0|fnf}t|�t|�f}t|�sTtd��tdd�|D��}tdd�|D��}x||fD]}t|�q�W|j|�s�td||@����tj|jdd�}t	t
tjtt
d	�d
d�}|j
j�}|�r|||�}||j
j|�M}|�r(|||�}	||j
j|	�M}|jdd�|jfS)a�	
        Return a subset of the DataFrame's columns based on the column dtypes.

        Parameters
        ----------
        include, exclude : scalar or list-like
            A selection of dtypes or strings to be included/excluded. At least
            one of these parameters must be supplied.

        Returns
        -------
        DataFrame
            The subset of the frame including the dtypes in ``include`` and
            excluding the dtypes in ``exclude``.

        Raises
        ------
        ValueError
            * If both of ``include`` and ``exclude`` are empty
            * If ``include`` and ``exclude`` have overlapping elements
            * If any kind of string dtype is passed in.

        See Also
        --------
        DataFrame.dtypes: Return Series with the data type of each column.

        Notes
        -----
        * To select all *numeric* types, use ``np.number`` or ``'number'``
        * To select strings you must use the ``object`` dtype, but note that
          this will return *all* object dtype columns
        * See the `numpy dtype hierarchy
          <https://numpy.org/doc/stable/reference/arrays.scalars.html>`__
        * To select datetimes, use ``np.datetime64``, ``'datetime'`` or
          ``'datetime64'``
        * To select timedeltas, use ``np.timedelta64``, ``'timedelta'`` or
          ``'timedelta64'``
        * To select Pandas categorical dtypes, use ``'category'``
        * To select Pandas datetimetz dtypes, use ``'datetimetz'`` (new in
          0.20.0) or ``'datetime64[ns, tz]'``

        Examples
        --------
        >>> df = pd.DataFrame({'a': [1, 2] * 3,
        ...                    'b': [True, False] * 3,
        ...                    'c': [1.0, 2.0] * 3})
        >>> df
           a      b    c
        0  1   True  1.0
        1  2  False  2.0
        2  1   True  1.0
        3  2  False  2.0
        4  1   True  1.0
        5  2  False  2.0

        >>> df.select_dtypes(include='bool')
               b
        0   True
        1  False
        2   True
        3  False
        4   True
        5  False

        >>> df.select_dtypes(include=['float64'])
           c
        0  1.0
        1  2.0
        2  1.0
        3  2.0
        4  1.0
        5  2.0

        >>> df.select_dtypes(exclude=['int64'])
               b    c
        0   True  1.0
        1  False  2.0
        2   True  1.0
        3  False  2.0
        4   True  1.0
        5  False  2.0
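
The Notes and Raises sections can be exercised together in a short sketch:
selecting all numeric columns with ``'number'``, combining ``include`` with
``exclude``, and the error raised when neither is given. The variable names
are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2] * 3,
                   'b': [True, False] * 3,
                   'c': [1.0, 2.0] * 3})

# 'number' matches the integer and float columns; bool is not numeric here.
numeric_cols = list(df.select_dtypes(include='number').columns)

# include and exclude can be combined as long as they do not overlap.
ints_only = list(df.select_dtypes(include='number', exclude='float64').columns)

# Supplying neither include nor exclude raises ValueError.
try:
    df.select_dtypes()
    raised = False
except ValueError:
    raised = True

print(numeric_cols, ints_only, raised)
```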
        Nz3at least one of include or exclude must be nonemptycss|]}t|�VqdS)N)rA)r�rVr�r�r�r��
sz*DataFrame.select_dtypes.<locals>.<genexpr>css|]}t|�VqdS)N)rA)r�rVr�r�r�r��
szinclude and exclude overlap on r�T)�
dtypes_set�
unique_dtypesr�cs�fdd�|D�}|S)Ncs g|]}t|jt���r|�qSr�)�
issubclassr r�)r�Zunique_dtype)r�r�r�r�szZDataFrame.select_dtypes.<locals>.extract_unique_dtypes_from_dtypes_set.<locals>.<listcomp>r�)r�r�Zextracted_dtypesr�)r�r��%extract_unique_dtypes_from_dtypes_sets
zFDataFrame.select_dtypes.<locals>.extract_unique_dtypes_from_dtypes_set)rM�	frozensetr�r�r7�
isdisjointr��fullr�r
rr�rr��unique�isinr�r�)
r��includercZ	selectionr�Z
keep_theser�r�Zincluded_dtypesZexcluded_dtypesr�r�r��
select_dtypes�
s4S


zDataFrame.select_dtypescCs2|j|�|j||dd�}|jj||||d�dS)a�
        Insert column into DataFrame at specified location.

        Raises a ValueError if `column` is already contained in the DataFrame,
        unless `allow_duplicates` is set to True.

        Parameters
        ----------
        loc : int
            Insertion index. Must satisfy 0 <= loc <= len(columns).
        column : str, number, or hashable object
            Label of the inserted column.
        value : int, Series, or array-like
        allow_duplicates : bool, optional
            Whether to allow inserting a column whose label already exists.
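
A minimal sketch of the behaviour described above, on made-up data: the
column is inserted in place at position ``loc``, a duplicate label raises
``ValueError``, and ``allow_duplicates=True`` lifts that restriction.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# insert() modifies df in place and returns None.
df.insert(1, 'newcol', [99, 99])
columns_after_insert = list(df.columns)

# Re-using an existing label is rejected by default ...
try:
    df.insert(0, 'col1', [5, 6])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

# ... unless duplicates are explicitly allowed.
df.insert(0, 'col1', [5, 6], allow_duplicates=True)
columns_with_duplicate = list(df.columns)

print(columns_after_insert, duplicate_rejected, columns_with_duplicate)
```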
        F)r�)�allow_duplicatesN)r�r�r�r)r�r�rhr�r�r�r�r�rs
zDataFrame.insertcKs2|j�}x$|j�D]\}}tj||�||<qW|S)a		
        Assign new columns to a DataFrame.

        Returns a new object with all original columns in addition to new ones.
        Existing columns that are re-assigned will be overwritten.

        Parameters
        ----------
        **kwargs : dict of {str: callable or Series}
            The column names are keywords. If the values are
            callable, they are computed on the DataFrame and
            assigned to the new columns. The callable must not
            change input DataFrame (though pandas doesn't check it).
            If the values are not callable, (e.g. a Series, scalar, or array),
            they are simply assigned.

        Returns
        -------
        DataFrame
            A new DataFrame with the new columns in addition to
            all the existing columns.

        Notes
        -----
        Assigning multiple columns within the same ``assign`` is possible.
        Later items in '\*\*kwargs' may refer to newly created or modified
        columns in 'df'; items are computed and assigned into 'df' in order.

        .. versionchanged:: 0.23.0

           Keyword argument order is maintained.

        Examples
        --------
        >>> df = pd.DataFrame({'temp_c': [17.0, 25.0]},
        ...                   index=['Portland', 'Berkeley'])
        >>> df
                  temp_c
        Portland    17.0
        Berkeley    25.0

        Where the value is a callable, evaluated on `df`:

        >>> df.assign(temp_f=lambda x: x.temp_c * 9 / 5 + 32)
                  temp_c  temp_f
        Portland    17.0    62.6
        Berkeley    25.0    77.0

        Alternatively, the same behavior can be achieved by directly
        referencing an existing Series or sequence:

        >>> df.assign(temp_f=df['temp_c'] * 9 / 5 + 32)
                  temp_c  temp_f
        Portland    17.0    62.6
        Berkeley    25.0    77.0

        You can create multiple columns within the same assign where one
        of the columns depends on another one defined within the same assign:

        >>> df.assign(temp_f=lambda x: x['temp_c'] * 9 / 5 + 32,
        ...           temp_k=lambda x: (x['temp_f'] +  459.67) * 5 / 9)
                  temp_c  temp_f  temp_k
        Portland    17.0    62.6  290.15
        Berkeley    25.0    77.0  298.15
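
Two properties stated above, that existing columns are overwritten when
re-assigned and that a new object is returned, can be seen in one call. The
one-degree offset below is an arbitrary value chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'temp_c': [17.0, 25.0]},
                  index=['Portland', 'Berkeley'])

# temp_c is overwritten first; temp_f is then computed from the new temp_c
# because keyword arguments are evaluated in order.
converted = df.assign(temp_c=lambda x: x['temp_c'] + 1.0,
                      temp_f=lambda x: x['temp_c'] * 9 / 5 + 32)

print(converted)
print(df)  # the original frame is left unchanged
```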
        )r�rr;r�)r�r�r�r�rr�r�r��assign.sBzDataFrame.assignc
s>�fdd�}t|t�r"||�}�n�t|t�r�t�jt�r�|�jkr��jj|�}t|tttjt	f�r�t
�j||�}t|�r�|j|j�r�|j
|dd�}||�j}�nt|t�r�|j�}t|�j�}n�t|t	�s�t|��rvt|�j�}t|tjt	f��s&t|t��rt|�dk�rt|�}n
tj|�}n:|jdk�r>|j�j}n"t|t	��rX|jdd�}n|j�}t|j��r�t|�}nHt|dd	�\}}t|��r�t|t�j�|�}ntt�j�|�}t ||�}t|��r�|S|�r.|�jk�r.|jdk�r.�jj!�st�jt��r.�|}	t|	t��r.tj"|t|	j�df�}tj#tj$|��S)
a�
        Ensures new columns (which go into the BlockManager as new blocks) are
        always copied and converted into an array.

        Parameters
        ----------
        key : object
        value : scalar, Series, or array-like
        broadcast : bool, default True
            If ``key`` matches multiple duplicate column names in the
            DataFrame, this parameter indicates whether ``value`` should be
            tiled so that the returned array contains a (duplicated) column for
            each occurrence of the key. If False, ``value`` will not be tiled.

        Returns
        -------
        numpy.ndarray
        csv|jj�j�st�j�r&|jj�}nLy|j�j�j}Wn8tk
rp}z|jjsV|�td�|�WYdd}~XnX|S)Nz6incompatible index of inserted column with frame index)	r�r�r�rr�rr�rr�)r�r�)r�r�r��	reindexer�sz-DataFrame._sanitize_column.<locals>.reindexerr�)r�rr�T)r�)rS)%r�rzr�r�rjrSr�r�r�rfrkr�r�rr&r_r�rwr�rQr�r9r;Zasarray_tuplesafer�rOr�r;r6rGr4r2r8rZtileZ
atleast_2dr)
r�r�r�r�r�r��colsZinfer_dtyper�Zexisting_piecer�)r�r�r�vsR







zDataFrame._sanitize_columncs�fdd�t�j�D�S)Ncs,i|]$\}}t�jj|��j|dd�|�qS)T)r�r�r�)rzr�r�r�)r��idx�item)r�r�r�r��sz%DataFrame._series.<locals>.<dictcomp>)rr�)r�r�)r�r��_series�s
zDataFrame._seriesc
Cst|�}|t|�krtd��|jjo*|jjs4td��d}|jsH||kr�|j}|jj|�}|jj|�}|d	kj�rzt	d��|d
kj�r�t	d��|t|j�|}|j
|}	n>tj|dd�}	x.t
t||��D]\}
\}}|j||�|	|
<q�Wt|	��r�tj|	�}	|	S)a�
        Label-based "fancy indexing" function for DataFrame.

        Given equal-length arrays of row and column labels, return an
        array of the values corresponding to each (row, col) pair.

        Parameters
        ----------
        row_labels : sequence
            The row labels to use for lookup.
        col_labels : sequence
            The column labels to use for lookup.

        Returns
        -------
        numpy.ndarray
            The found values.
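
        As a rough illustration of the label-based pairing described above,
        the sketch below reproduces the same (row, col) pairing with public
        ``Index.get_indexer`` calls and NumPy integer indexing. The frame and
        the label lists are hypothetical example data, and this is only a
        sketch of the idea, not necessarily how the method itself is
        implemented.

```python
import pandas as pd

# Hypothetical example frame with unique row and column labels.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["x", "y", "z"])

row_labels = ["x", "z", "y"]
col_labels = ["B", "A", "B"]

# Translate labels into integer positions; -1 marks a label that is missing.
ridx = df.index.get_indexer(row_labels)
cidx = df.columns.get_indexer(col_labels)
if (ridx == -1).any() or (cidx == -1).any():
    raise KeyError("One or more labels was not found")

# Pick one value per (row, col) pair with NumPy integer-array indexing.
found = df.to_numpy()[ridx, cidx]
```

        Each position in ``found`` corresponds to one (row, col) pair, so the
        result has the same length as the two label sequences.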
        z/Row labels must have same size as column labelsz2DataFrame.lookup requires unique index and columnsi�r�z$One or more row labels was not foundz'One or more column labels was not found�O)r����r�)r�r�r�rr�r�r��get_indexerr�r`Zflatr��emptyrr
r�rOr�maybe_convert_objects)
r�Z
row_labelsZ
col_labelsrm�threshr�ZridxZcidxZ
flat_indexr$rr4rgr�r�r��lookup�s,

zDataFrame.lookupcCsT|}|d}	|	dk	r*|j|	||||||�}|d}
|
dk	rP|j|
||||||�}|S)Nr�r�)�_reindex_columns�_reindex_index)r�r�r��limit�	tolerancer�r�r��framer�r�r�r�r��
_reindex_axesszDataFrame._reindex_axesc	Cs4|jj|||||d�\}}|jd||gi||dd�S)N)r�r�r�r�rF)r�r��
allow_dups)r�r�_reindex_with_indexers)	r��	new_indexr�r�r�r�r�r�r�r�r�r�r�0s

zDataFrame._reindex_indexc	Cs4|jj|||||d�\}}|jd||gi||dd�S)N)r�r�r�r�r�F)r�r�r�)r�rr�)	r�r�r�r�r�r�r�r�r�r�r�r�r�Ds

zDataFrame._reindex_columnsc
Cs�|jj|d�\}}|jj|d�\}}|dk	rb|dk	rb||f}tj|j||d�}	|j|	||d�S|j||g||gd�||d�SdS)z:
        We are guaranteed non-Nones in the axes.
        r�r�N)r�)r�r�)rr�)r�r�)r�rr�rWZ
take_2d_multir�r�r�)
r�r�r�r�rZrow_indexerr�Zcol_indexerr�r�r�r�r��_reindex_multiXszDataFrame._reindex_multi�outercs t�j|||||||||	|
d�
S)N)	�joinr�r�r�r�r�r��	fill_axis�broadcast_axis)�super�align)r�r!rr�r�r�r�r�r�rr)�	__class__r�r�rlszDataFrame.aligna`
        Examples
        --------
        >>> df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

        Change the row labels.

        >>> df.set_axis(['a', 'b', 'c'], axis='index')
           A  B
        a  1  4
        b  2  5
        c  3  6

        Change the column labels.

        >>> df.set_axis(['I', 'II'], axis='columns')
           I  II
        0  1   4
        1  2   5
        2  3   6

        Now, update the labels inplace.

        >>> df.set_axis(['i', 'ii'], axis='columns', inplace=True)
        >>> df
           i  ii
        0  1   4
        1  2   5
        2  3   6
        z
 column orz, and 1 identifies the columnsz or columns)Zextended_summary_subZaxis_description_subr�)r�r�cst�j|||d�S)N)r�r�)r�set_axis)r��labelsr�r�)rr�r�r	�s(zDataFrame.set_axisr
r�r�r�r�r�r�cs@t|||dd�}|j|�|jdd�|jdd�t�jf|�S)Nr
rr�)r/rar�rr)r�r�r�r�)rr�r�r�s

zDataFrame.reindex�raisec	st�j|||||||d�S)aA
        Drop specified labels from rows or columns.

        Remove rows or columns by specifying label names and corresponding
        axis, or by specifying directly index or column names. When using a
        multi-index, labels on different levels can be removed by specifying
        the level.

        Parameters
        ----------
        labels : single label or list-like
            Index or column labels to drop.
        axis : {0 or 'index', 1 or 'columns'}, default 0
            Whether to drop labels from the index (0 or 'index') or
            columns (1 or 'columns').
        index : single label or list-like
            Alternative to specifying axis (``labels, axis=0``
            is equivalent to ``index=labels``).
        columns : single label or list-like
            Alternative to specifying axis (``labels, axis=1``
            is equivalent to ``columns=labels``).
        level : int or level name, optional
            For MultiIndex, level from which the labels will be removed.
        inplace : bool, default False
            If False, return a copy. Otherwise, do operation
            inplace and return None.
        errors : {'ignore', 'raise'}, default 'raise'
            If 'ignore', suppress error and only existing labels are
            dropped.

        Returns
        -------
        DataFrame
            DataFrame without the removed index or column labels.

        Raises
        ------
        KeyError
            If any of the labels is not found in the selected axis.

        See Also
        --------
        DataFrame.loc : Label-location based indexer for selection by label.
        DataFrame.dropna : Return DataFrame with labels on given axis omitted
            where (all or any) data are missing.
        DataFrame.drop_duplicates : Return DataFrame with duplicate rows
            removed, optionally only considering certain columns.
        Series.drop : Return Series with specified index labels removed.

        Examples
        --------
        >>> df = pd.DataFrame(np.arange(12).reshape(3, 4),
        ...                   columns=['A', 'B', 'C', 'D'])
        >>> df
           A  B   C   D
        0  0  1   2   3
        1  4  5   6   7
        2  8  9  10  11

        Drop columns

        >>> df.drop(['B', 'C'], axis=1)
           A   D
        0  0   3
        1  4   7
        2  8  11

        >>> df.drop(columns=['B', 'C'])
           A   D
        0  0   3
        1  4   7
        2  8  11

        Drop rows by index

        >>> df.drop([0, 1])
           A  B   C   D
        2  8  9  10  11

        Drop columns and/or rows of MultiIndex DataFrame

        >>> midx = pd.MultiIndex(levels=[['lama', 'cow', 'falcon'],
        ...                              ['speed', 'weight', 'length']],
        ...                      codes=[[0, 0, 0, 1, 1, 1, 2, 2, 2],
        ...                             [0, 1, 2, 0, 1, 2, 0, 1, 2]])
        >>> df = pd.DataFrame(index=midx, columns=['big', 'small'],
        ...                   data=[[45, 30], [200, 100], [1.5, 1], [30, 20],
        ...                         [250, 150], [1.5, 0.8], [320, 250],
        ...                         [1, 0.8], [0.3, 0.2]])
        >>> df
                        big     small
        lama    speed   45.0    30.0
                weight  200.0   100.0
                length  1.5     1.0
        cow     speed   30.0    20.0
                weight  250.0   150.0
                length  1.5     0.8
        falcon  speed   320.0   250.0
                weight  1.0     0.8
                length  0.3     0.2

        >>> df.drop(index='cow', columns='small')
                        big
        lama    speed   45.0
                weight  200.0
                length  1.5
        falcon  speed   320.0
                weight  1.0
                length  0.3

        >>> df.drop(index='length', level=1)
                        big     small
        lama    speed   45.0    30.0
                weight  200.0   100.0
        cow     speed   30.0    20.0
                weight  250.0   150.0
        falcon  speed   320.0   250.0
                weight  1.0     0.8
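
        The ``errors`` parameter is not exercised by the examples above. The
        following is a minimal sketch, assuming a small hypothetical frame and
        a column label ``'Z'`` that does not exist, of how ``errors='ignore'``
        differs from the default ``errors='raise'``.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with columns A-D; 'Z' is deliberately absent.
df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=["A", "B", "C", "D"])

# With errors='ignore', the existing label 'B' is dropped and 'Z' is skipped.
result = df.drop(columns=["B", "Z"], errors="ignore")

# With the default errors='raise', a missing label raises KeyError.
try:
    df.drop(columns=["Z"])
except KeyError as exc:
    missing_error = exc
else:
    missing_error = None
```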
        )r
r�r�r�r�r��errors)rrb)r�r
r�r�r�r�r�r)rr�r�rb�szDataFrame.drop�mapperr�r�ignore)r�r�r�r�r�r�r)	r
r�r�r�r�r�r�rr�c	
st�j||||||||d�S)ai

        Alter axes labels.

        Function / dict values must be unique (1-to-1). Labels not contained in
        a dict / Series will be left as-is. Extra labels listed don't throw an
        error.

        See the :ref:`user guide <basics.rename>` for more.

        Parameters
        ----------
        mapper : dict-like or function
            Dict-like or function transformations to apply to
            that axis' values. Use either ``mapper`` and ``axis`` to
            specify the axis to target with ``mapper``, or ``index`` and
            ``columns``.
        index : dict-like or function
            Alternative to specifying axis (``mapper, axis=0``
            is equivalent to ``index=mapper``).
        columns : dict-like or function
            Alternative to specifying axis (``mapper, axis=1``
            is equivalent to ``columns=mapper``).
        axis : {0 or 'index', 1 or 'columns'}, default 0
            Axis to target with ``mapper``. Can be either the axis name
            ('index', 'columns') or number (0, 1). The default is 'index'.
        copy : bool, default True
            Also copy underlying data.
        inplace : bool, default False
            Whether to modify the DataFrame in place rather than returning a
            new one. If True, the value of ``copy`` is ignored.
        level : int or level name, default None
            In case of a MultiIndex, only rename labels in the specified
            level.
        errors : {'ignore', 'raise'}, default 'ignore'
            If 'raise', raise a `KeyError` when a dict-like `mapper`, `index`,
            or `columns` contains labels that are not present in the Index
            being transformed.
            If 'ignore', existing keys will be renamed and extra keys will be
            ignored.

        Returns
        -------
        DataFrame
            DataFrame with the renamed axis labels.

        Raises
        ------
        KeyError
            If any of the labels is not found in the selected axis and
            "errors='raise'".

        See Also
        --------
        DataFrame.rename_axis : Set the name of the axis.

        Examples
        --------
        ``DataFrame.rename`` supports two calling conventions

        * ``(index=index_mapper, columns=columns_mapper, ...)``
        * ``(mapper, axis={'index', 'columns'}, ...)``

        We *highly* recommend using keyword arguments to clarify your
        intent.

        Rename columns using a mapping:

        >>> df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
        >>> df.rename(columns={"A": "a", "B": "c"})
           a  c
        0  1  4
        1  2  5
        2  3  6

        Rename index using a mapping:

        >>> df.rename(index={0: "x", 1: "y", 2: "z"})
           A  B
        x  1  4
        y  2  5
        z  3  6

        Cast index labels to a different type:

        >>> df.index
        RangeIndex(start=0, stop=3, step=1)
        >>> df.rename(index=str).index
        Index(['0', '1', '2'], dtype='object')

        >>> df.rename(columns={"A": "a", "B": "b", "C": "c"}, errors="raise")
        Traceback (most recent call last):
        KeyError: ['C'] not found in axis

        Using axis-style parameters

        >>> df.rename(str.lower, axis='columns')
           a  b
        0  1  4
        1  2  5
        2  3  6

        >>> df.rename({1: 2, 2: 4}, axis='index')
           A  B
        0  1  4
        2  2  5
        4  3  6
        )r
r�r�r�r�r�r�r)rr)	r�r
r�r�r�r�r�r�r)rr�r�rQs{zDataFrame.renamecst�j||||||d�S)N)r�r�r�r�r��downcast)r�fillna)r�r�r�r�r�r�r)rr�r�r�s
zDataFrame.fillna)r�r�cst�j|d�S)a1
        Return item and drop from frame. Raise KeyError if not found.

        Parameters
        ----------
        item : label
            Label of column to be popped.

        Returns
        -------
        Series

        Examples
        --------
        >>> df = pd.DataFrame([('falcon', 'bird', 389.0),
        ...                    ('parrot', 'bird', 24.0),
        ...                    ('lion', 'mammal', 80.5),
        ...                    ('monkey', 'mammal', np.nan)],
        ...                   columns=('name', 'class', 'max_speed'))
        >>> df
             name   class  max_speed
        0  falcon    bird      389.0
        1  parrot    bird       24.0
        2    lion  mammal       80.5
        3  monkey  mammal        NaN

        >>> df.pop('class')
        0      bird
        1      bird
        2    mammal
        3    mammal
        Name: class, dtype: object

        >>> df
             name  max_speed
        0  falcon      389.0
        1  parrot       24.0
        2    lion       80.5
        3  monkey        NaN
        )r�)rr�)r�r�)rr�r�r��s)z
DataFrame.pop�padcst�j||||||d�S)N)�
to_replacer�r�r��regexr�)rr�)r�rr�r�r�rr�)rr�r�r�s
zDataFrame.replace)�mappingr�cCs�|r|n|j�}|j}xdtt|��D]T}|||kr$|jdd�|f}|||\}}	|j||	|d�}
|
|jdd�|f<q$W|r�dS|j|�S)aA
        Dispatch to Series.replace column-wise.


        Parameters
        ----------
        mapping : dict
            of the form {col: (target, value)}
        inplace : bool
        regex : bool or same types as `to_replace` in DataFrame.replace

        Returns
        -------
        DataFrame or None
        N)r)r�r�r�r�r�r�r�)r�rr�rr��axrZserr�r�Znewobjr�r�r��_replace_columnwise(szDataFrame._replace_columnwise)r�r�cst�j||||d�S)N)�periods�freqr�r�)r�shift)r�rrr�r�)rr�r�rKszDataFrame.shiftcCs�t|d�}t|t�s|g}d}g}x�|D]�}t|tttjttjf�r^t	|dd�dkr�t
|��q(y||jk}	Wn:tk
r�}
zt|�dt
|����|
�WYdd}
~
Xq(X|	s(|j|�q(W|r�td|�d���|r�|}n|j�}g}g}
|�r@t|jj�}
t|jt��r4x6t|jj�D]}|j|jj|���qWn|j|j�g}�x(|D�]}t|t��r�x$t|j�D]}|j|j|���qjW|
j|j�n�t|ttf��r�|j|�|
j|j�nzt|ttjf��r�|j|�|
jd�nRt|tj��r
|jt|��|
jd�n*|j||j�|
j|�|�r4|j|�t|d�t|�k�rLt
d	t|��d
t|d
������qLWt||
�}|�r�|j�r�||j�j�}t
d|����xt|�D]}||=�q�W|j �||_|�s�|SdS)a�
        Set the DataFrame index using existing columns.

        Set the DataFrame index (row labels) using one or more existing
        columns or arrays (of the correct length). The index can replace the
        existing index or expand on it.

        Parameters
        ----------
        keys : label or array-like or list of labels/arrays
            This parameter can be either a single column key, a single array of
            the same length as the calling DataFrame, or a list containing an
            arbitrary combination of column keys and arrays. Here, "array"
            encompasses :class:`Series`, :class:`Index`, ``np.ndarray``, and
            instances of :class:`~collections.abc.Iterator`.
        drop : bool, default True
            Delete columns to be used as the new index.
        append : bool, default False
            Whether to append columns to existing index.
        inplace : bool, default False
            Modify the DataFrame in place (do not create a new object).
        verify_integrity : bool, default False
            Check the new index for duplicates. Otherwise defer the check until
            necessary. Setting to False will improve the performance of this
            method.

        Returns
        -------
        DataFrame or None
            DataFrame with the new row labels, or None if ``inplace=True``.

        See Also
        --------
        DataFrame.reset_index : Opposite of set_index.
        DataFrame.reindex : Change to new indices or expand indices.
        DataFrame.reindex_like : Change to same indices as other DataFrame.

        Examples
        --------
        >>> df = pd.DataFrame({'month': [1, 4, 7, 10],
        ...                    'year': [2012, 2014, 2013, 2014],
        ...                    'sale': [55, 40, 84, 31]})
        >>> df
           month  year  sale
        0      1  2012    55
        1      4  2014    40
        2      7  2013    84
        3     10  2014    31

        Set the index to become the 'month' column:

        >>> df.set_index('month')
               year  sale
        month
        1      2012    55
        4      2014    40
        7      2013    84
        10     2014    31

        Create a MultiIndex using columns 'year' and 'month':

        >>> df.set_index(['year', 'month'])
                    sale
        year  month
        2012  1     55
        2014  4     40
        2013  7     84
        2014  10    31

        Create a MultiIndex using an Index and a column:

        >>> df.set_index([pd.Index([1, 2, 3, 4]), 'year'])
                 month  sale
           year
        1  2012  1      55
        2  2014  4      40
        3  2013  7      84
        4  2014  10     31

        Create a MultiIndex using two Series:

        >>> s = pd.Series([1, 2, 3, 4])
        >>> df.set_index([s, s**2])
              month  year  sale
        1 1       1  2012    55
        2 4       4  2014    40
        3 9       7  2013    84
        4 16     10  2014    31
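
        The ``verify_integrity`` parameter is not shown in the examples above.
        The sketch below, which reuses the same example data, suggests how it
        behaves when the chosen column contains repeated values; the variable
        names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"month": [1, 4, 7, 10],
                   "year": [2012, 2014, 2013, 2014],
                   "sale": [55, 40, 84, 31]})

# 'year' contains 2014 twice, so the integrity check rejects it.
try:
    df.set_index("year", verify_integrity=True)
except ValueError as exc:
    message = str(exc)
else:
    message = None

# Without the check (the default), the duplicated index is accepted.
indexed = df.set_index("year")
```

        Skipping the check is faster, but later label-based operations may
        then have to cope with a non-unique index.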
        r�z�The parameter "keys" may be a column key, one-dimensional array, or a list containing only valid column keys and one-dimensional arrays.r�r�z. Received column of type NzNone of z are in the columnszLength mismatch: Expected z  rows, received array of length zIndex has duplicate keys: r�r�)!r0r�r�rfrzr�r�rr
r�r�r�r�r rr`r�r�r�rjr�r�Z_get_level_valuesrr�rr�rhr�
duplicatedr�r^�_cleanup)r�r+rbrr�rq�err_msg�missingrW�foundr�r�r�r�rrXrmr��
duplicatesrgr�r�r��	set_indexQs|\







$
zDataFrame.set_indexr�)r�rbr��	col_level�col_fillr�cst|d�}|r�}n�j�}d
dd�}tjt|��}|dk	r~t|ttf�sP|g}�fdd�|D�}t|��jj	kr~�jj
|�}|�s�t�jt�r�dd�t�jj
�D�}	t�jj�jj�}
n8d�kr�dnd	}�jjdkr�|gn�jjg}	�jdff}
t�jt�}x�ttt|
���D]�\}
\}}|dk�p,|
|k�s4�q|	|
}|�r�t|t��rVt|�n|g}|dk�r�t|�d
�jj	fk�r�td|����|d}�jj|�}|g||}�jj	t|�}||g|7}t|�}|||�}|jd||��qW||_|�s|SdS)a
        Reset the index, or a level of it.

        Reset the index of the DataFrame, and use the default one instead.
        If the DataFrame has a MultiIndex, this method can remove one or more
        levels.

        Parameters
        ----------
        level : int, str, tuple, or list, default None
            Only remove the given levels from the index. Removes all levels by
            default.
        drop : bool, default False
            Do not try to insert index into dataframe columns. This resets
            the index to the default integer index.
        inplace : bool, default False
            Modify the DataFrame in place (do not create a new object).
        col_level : int or str, default 0
            If the columns have multiple levels, determines which level the
            labels are inserted into. By default it is inserted into the first
            level.
        col_fill : object, default ''
            If the columns have multiple levels, determines how the other
            levels are named. If None then the index name is repeated.

        Returns
        -------
        DataFrame or None
            DataFrame with the new index or None if ``inplace=True``.

        See Also
        --------
        DataFrame.set_index : Opposite of reset_index.
        DataFrame.reindex : Change to new indices or expand indices.
        DataFrame.reindex_like : Change to same indices as other DataFrame.

        Examples
        --------
        >>> df = pd.DataFrame([('bird', 389.0),
        ...                    ('bird', 24.0),
        ...                    ('mammal', 80.5),
        ...                    ('mammal', np.nan)],
        ...                   index=['falcon', 'parrot', 'lion', 'monkey'],
        ...                   columns=('class', 'max_speed'))
        >>> df
                 class  max_speed
        falcon    bird      389.0
        parrot    bird       24.0
        lion    mammal       80.5
        monkey  mammal        NaN

        When we reset the index, the old index is added as a column, and a
        new sequential index is used:

        >>> df.reset_index()
            index   class  max_speed
        0  falcon    bird      389.0
        1  parrot    bird       24.0
        2    lion  mammal       80.5
        3  monkey  mammal        NaN

        We can use the `drop` parameter to avoid the old index being added as
        a column:

        >>> df.reset_index(drop=True)
            class  max_speed
        0    bird      389.0
        1    bird       24.0
        2  mammal       80.5
        3  mammal        NaN

        You can also use `reset_index` with `MultiIndex`.

        >>> index = pd.MultiIndex.from_tuples([('bird', 'falcon'),
        ...                                    ('bird', 'parrot'),
        ...                                    ('mammal', 'lion'),
        ...                                    ('mammal', 'monkey')],
        ...                                   names=['class', 'name'])
        >>> columns = pd.MultiIndex.from_tuples([('speed', 'max'),
        ...                                      ('species', 'type')])
        >>> df = pd.DataFrame([(389.0, 'fly'),
        ...                    ( 24.0, 'fly'),
        ...                    ( 80.5, 'run'),
        ...                    (np.nan, 'jump')],
        ...                   index=index,
        ...                   columns=columns)
        >>> df
                       speed species
                         max    type
        class  name
        bird   falcon  389.0     fly
               parrot   24.0     fly
        mammal lion     80.5     run
               monkey    NaN    jump

        If the index has multiple levels, we can reset a subset of them:

        >>> df.reset_index(level='class')
                 class  speed species
                          max    type
        name
        falcon    bird  389.0     fly
        parrot    bird   24.0     fly
        lion    mammal   80.5     run
        monkey  mammal    NaN    jump

        If we are not dropping the index, by default, it is placed in the top
        level. We can place it in another level:

        >>> df.reset_index(level='class', col_level=1)
                        speed species
                 class    max    type
        name
        falcon    bird  389.0     fly
        parrot    bird   24.0     fly
        lion    mammal   80.5     run
        monkey  mammal    NaN    jump

        When the index is inserted under another level, we can specify under
        which one with the parameter `col_fill`:

        >>> df.reset_index(level='class', col_level=1, col_fill='species')
                      species  speed species
                        class    max    type
        name
        falcon           bird  389.0     fly
        parrot           bird   24.0     fly
        lion           mammal   80.5     run
        monkey         mammal    NaN    jump

        If we specify a nonexistent level for `col_fill`, it is created:

        >>> df.reset_index(level='class', col_level=1, col_fill='genus')
                        genus  speed species
                        class    max    type
        name
        falcon           bird  389.0     fly
        parrot           bird   24.0     fly
        lion           mammal   80.5     run
        monkey         mammal    NaN    jump
        r�Nc	Ss�|j}t|ttf�s*|jtjkr*tj|�}|dk	r�|dk}|j	dkrl|j
�rl|j}t|�}t|t
|�|�}nX|j|�}t|�}|j}t|t�r�|j}|j�r�t||tj�\}}t|t�r�|||d�}|S)Nr�r)r�r�)rr�rlrir�r�Zobject_rr�r��allrUr4r��taker r��DatetimeLikeArrayr�r�r=r�)	r�r
r�r�r�r�Zvalues_typeZvalues_dtyper�r�r�r��_maybe_casted_values�s*



z3DataFrame.reset_index.<locals>._maybe_casted_valuescsg|]}�jj|��qSr�)r��_get_level_number)r��lev)r�r�r�r��sz)DataFrame.reset_index.<locals>.<listcomp>cSs&g|]\}}|dk	r|nd|���qS)Nrfr�)r�rrmr�r�r�r��sr�Zlevel_0r�z:col_fill=None is incompatible with incomplete column name r)N)r0r�r�r�r�r�r�r�r�r�Z	droplevelrjrr�r
�levels�codesr�r��reversedr�r'r)r�r�rbr�r!r"�new_objr&rr�Z	to_insert�defaultZ	multi_colrr(Zlabr�Zcol_nameZlev_numZname_lstrZlevel_valuesr�)r�r��reset_index
sX

(

zDataFrame.reset_indexcCs"|j|jjtd��}|j|dd�S)N)�funcrT)r�)r�r�rTr�)r�r$r�r�r�rTszDataFrame.isnacCs|j�S)N)rT)r�r�r�r��isnullszDataFrame.isnullcCs
|j�S)N)rT)r�r�r�r�rVszDataFrame.notnacCs
|j�S)N)rT)r�r�r�r��notnullszDataFrame.notnullr�cCst|d�}t|ttf�r td��|j|�}d|}|}|dk	r�|j|�}|j|�}	|	dk}
|
j�rvt	tt
j|
|����|j|	|d�}|j
|d�}|dk	r�||k}nN|dkr�|t|j|��k}n2|dkr�|dk}n |dk	r�td	|����ntd
��|j|d�|}
|�r|j|
�n|
SdS)ae

        Remove missing values.

        See the :ref:`User Guide <missing_data>` for more on which values are
        considered missing, and how to work with missing data.

        Parameters
        ----------
        axis : {0 or 'index', 1 or 'columns'}, default 0
            Determine if rows or columns which contain missing values are
            removed.

            * 0, or 'index' : Drop rows which contain missing values.
            * 1, or 'columns' : Drop columns which contain missing values.

            .. versionchanged:: 1.0.0

               Passing a tuple or list to drop on multiple axes is no longer
               supported. Only a single axis is allowed.

        how : {'any', 'all'}, default 'any'
            Determine if row or column is removed from DataFrame, when we have
            at least one NA or all NA.

            * 'any' : If any NA values are present, drop that row or column.
            * 'all' : If all values are NA, drop that row or column.

        thresh : int, optional
            Require that many non-NA values.
        subset : array-like, optional
            Labels along other axis to consider, e.g. if you are dropping rows
            these would be a list of columns to include.
        inplace : bool, default False
            If True, do operation inplace and return None.

        Returns
        -------
        DataFrame
            DataFrame with NA entries dropped from it.

        See Also
        --------
        DataFrame.isna: Indicate missing values.
        DataFrame.notna : Indicate existing (non-missing) values.
        DataFrame.fillna : Replace missing values.
        Series.dropna : Drop missing values.
        Index.dropna : Drop missing indices.

        Examples
        --------
        >>> df = pd.DataFrame({"name": ['Alfred', 'Batman', 'Catwoman'],
        ...                    "toy": [np.nan, 'Batmobile', 'Bullwhip'],
        ...                    "born": [pd.NaT, pd.Timestamp("1940-04-25"),
        ...                             pd.NaT]})
        >>> df
               name        toy       born
        0    Alfred        NaN        NaT
        1    Batman  Batmobile 1940-04-25
        2  Catwoman   Bullwhip        NaT

        Drop the rows where at least one element is missing.

        >>> df.dropna()
             name        toy       born
        1  Batman  Batmobile 1940-04-25

        Drop the columns where at least one element is missing.

        >>> df.dropna(axis='columns')
               name
        0    Alfred
        1    Batman
        2  Catwoman

        Drop the rows where all elements are missing.

        >>> df.dropna(how='all')
               name        toy       born
        0    Alfred        NaN        NaT
        1    Batman  Batmobile 1940-04-25
        2  Catwoman   Bullwhip        NaT

        Keep only the rows with at least 2 non-NA values.

        >>> df.dropna(thresh=2)
               name        toy       born
        1    Batman  Batmobile 1940-04-25
        2  Catwoman   Bullwhip        NaT

        Define in which columns to look for missing values.

        >>> df.dropna(subset=['name', 'born'])
               name        toy       born
        1    Batman  Batmobile 1940-04-25

        Keep the DataFrame with valid entries in the same variable.

        >>> df.dropna(inplace=True)
        >>> df
             name        toy       born
        1  Batman  Batmobile 1940-04-25
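
        In the frame above no row is entirely missing, so ``how='all'`` leaves
        it unchanged. The sketch below uses a separate, hypothetical frame that
        does contain an all-NA row to contrast ``how='any'``, ``how='all'``
        and ``thresh``.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 0 is complete, row 1 is partly NA, row 2 is all NA.
df = pd.DataFrame({"a": [1.0, np.nan, np.nan],
                   "b": [2.0, 5.0, np.nan]})

any_dropped = df.dropna(how="any")   # drops rows 1 and 2
all_dropped = df.dropna(how="all")   # drops only row 2
thresh_kept = df.dropna(thresh=1)    # keeps rows with at least 1 non-NA value
```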
        r�z7supplying multiple axes to axis is no longer supported.r�N)r�r�r#rzinvalid how option: zmust specify how or threshr�)r0r�r�r�r��_get_axis_number�	_get_axisZget_indexer_forr�r`r��compressr$rlr�r�r�r�)r�r��howr��subsetr��agg_axisZagg_objr�indices�checkrlr�r$r�r�r��dropnas6g





zDataFrame.dropna�first)r6�keepr��ignore_indexr�cCs^|jr|j�St|d�}|j||d�}||}|rDtjt|��|_|rV|j|�dS|SdS)u�	
        Return DataFrame with duplicate rows removed.

        Considering certain columns is optional. Indexes, including time
        indexes, are ignored.

        Parameters
        ----------
        subset : column label or sequence of labels, optional
            Only consider certain columns for identifying duplicates, by
            default use all of the columns.
        keep : {'first', 'last', False}, default 'first'
            Determines which duplicates (if any) to keep.

            - ``first`` : Drop duplicates except for the first occurrence.
            - ``last`` : Drop duplicates except for the last occurrence.
            - False : Drop all duplicates.

        inplace : bool, default False
            Whether to drop duplicates in place or to return a copy.
        ignore_index : bool, default False
            If True, the resulting axis will be labeled 0, 1, …, n - 1.

            .. versionadded:: 1.0.0

        Returns
        -------
        DataFrame
            DataFrame with duplicates removed or None if ``inplace=True``.

        See Also
        --------
        DataFrame.value_counts: Count unique combinations of columns.

        Examples
        --------
        Consider a dataset containing ramen ratings.

        >>> df = pd.DataFrame({
        ...     'brand': ['Yum Yum', 'Yum Yum', 'Indomie', 'Indomie', 'Indomie'],
        ...     'style': ['cup', 'cup', 'cup', 'pack', 'pack'],
        ...     'rating': [4, 4, 3.5, 15, 5]
        ... })
        >>> df
            brand style  rating
        0  Yum Yum   cup     4.0
        1  Yum Yum   cup     4.0
        2  Indomie   cup     3.5
        3  Indomie  pack    15.0
        4  Indomie  pack     5.0

        By default, it removes duplicate rows based on all columns.

        >>> df.drop_duplicates()
            brand style  rating
        0  Yum Yum   cup     4.0
        2  Indomie   cup     3.5
        3  Indomie  pack    15.0
        4  Indomie  pack     5.0

        To remove duplicates on specific column(s), use ``subset``.

        >>> df.drop_duplicates(subset=['brand'])
            brand style  rating
        0  Yum Yum   cup     4.0
        2  Indomie   cup     3.5

        To remove duplicates and keep the last occurrences, use ``keep``.

        >>> df.drop_duplicates(subset=['brand', 'style'], keep='last')
            brand style  rating
        1  Yum Yum   cup     4.0
        2  Indomie   cup     3.5
        4  Indomie  pack     5.0
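
        The ``ignore_index`` parameter is not covered by the examples above.
        As a small sketch on the same ramen data (variable names are
        illustrative), compare the row labels with and without it.

```python
import pandas as pd

df = pd.DataFrame({
    "brand": ["Yum Yum", "Yum Yum", "Indomie", "Indomie", "Indomie"],
    "style": ["cup", "cup", "cup", "pack", "pack"],
    "rating": [4, 4, 3.5, 15, 5],
})

# Default: surviving rows keep their original labels (0, 2, 3, 4).
kept = df.drop_duplicates()

# ignore_index=True: surviving rows are relabeled 0, 1, ..., n - 1.
relabeled = df.drop_duplicates(ignore_index=True)
```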
        r�)r<N)	r�r�r0rr�r�r�r�r�)r�r6r<r�r=rr$r�r�r��drop_duplicates�sP


zDataFrame.drop_duplicatesrz)r6r<r�cs�ddlm�m}ddlm}�jr.�jtd�S��fdd�}�dkrL�j�n0t	j
��svt�t�svt�t
�r|��jkr|�f�tt���t��j�j�}|js�t|���fdd	��j�D�}tttt||���\}}	|||	d
d
d�}
�j||
|��jd�S)
a)

        Return boolean Series denoting duplicate rows.

        Considering certain columns is optional.

        Parameters
        ----------
        subset : column label or sequence of labels, optional
            Only consider certain columns for identifying duplicates, by
            default use all of the columns.
        keep : {'first', 'last', False}, default 'first'
            Determines which duplicates (if any) to mark.

            - ``first`` : Mark duplicates as ``True`` except for the first occurrence.
            - ``last`` : Mark duplicates as ``True`` except for the last occurrence.
            - False : Mark all duplicates as ``True``.

        Returns
        -------
        Series
            Boolean Series indicating which rows are duplicates.

        See Also
        --------
        Index.duplicated : Equivalent method on index.
        Series.duplicated : Equivalent method on Series.
        Series.drop_duplicates : Remove duplicate values from Series.
        DataFrame.drop_duplicates : Remove duplicate values from DataFrame.

        Examples
        --------
        Consider a dataset containing ramen ratings.

        >>> df = pd.DataFrame({
        ...     'brand': ['Yum Yum', 'Yum Yum', 'Indomie', 'Indomie', 'Indomie'],
        ...     'style': ['cup', 'cup', 'cup', 'pack', 'pack'],
        ...     'rating': [4, 4, 3.5, 15, 5]
        ... })
        >>> df
            brand style  rating
        0  Yum Yum   cup     4.0
        1  Yum Yum   cup     4.0
        2  Indomie   cup     3.5
        3  Indomie  pack    15.0
        4  Indomie  pack     5.0

        By default, for each set of duplicated values, the first occurrence
        is set to False and all others to True.

        >>> df.duplicated()
        0    False
        1     True
        2    False
        3    False
        4    False
        dtype: bool

        By using 'last', the last occurrence of each set of duplicated values
        is set to False and all others to True.

        >>> df.duplicated(keep='last')
        0     True
        1    False
        2    False
        3    False
        4    False
        dtype: bool

        By setting ``keep`` to False, all duplicates are marked True.

        >>> df.duplicated(keep=False)
        0     True
        1     True
        2    False
        3    False
        4    False
        dtype: bool

        To find duplicates on specific column(s), use ``subset``.

        >>> df.duplicated(subset=['brand'])
        0    False
        1     True
        2    False
        3     True
        4     True
        dtype: bool
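
The ``keep`` rules described above can be sketched in plain Python. This is a minimal illustration of the marking semantics only, not how pandas computes the result; the helper name ``duplicated_sketch`` and the tuple-per-row representation are assumptions made for the example.

```python
from typing import Hashable, List, Sequence, Union


def duplicated_sketch(
    rows: Sequence[Hashable], keep: Union[str, bool] = "first"
) -> List[bool]:
    """Mark duplicate rows following the documented ``keep`` rules."""
    if keep not in ("first", "last", False):
        raise ValueError("keep must be 'first', 'last' or False")
    n = len(rows)
    if keep is False:
        # Every member of a repeated group is marked.
        counts = {}
        for row in rows:
            counts[row] = counts.get(row, 0) + 1
        return [counts[row] > 1 for row in rows]
    # Walk forward for 'first' and backward for 'last'; the first row met
    # in that direction is the one left unmarked.
    order = range(n) if keep == "first" else range(n - 1, -1, -1)
    seen = set()
    flags = [False] * n
    for i in order:
        if rows[i] in seen:
            flags[i] = True
        else:
            seen.add(rows[i])
    return flags


# Same ramen data as in the examples above, one tuple per row.
rows = [
    ("Yum Yum", "cup", 4.0),
    ("Yum Yum", "cup", 4.0),
    ("Indomie", "cup", 3.5),
    ("Indomie", "pack", 15.0),
    ("Indomie", "pack", 5.0),
]
first_flags = duplicated_sketch(rows)
last_flags = duplicated_sketch(rows, keep="last")
all_flags = duplicated_sketch(rows, keep=False)
# Restricting the comparison to one column plays the role of ``subset``.
brand_flags = duplicated_sketch([row[0] for row in rows])
```

Dropping the rows whose flag is True gives the behaviour documented for ``drop_duplicates``.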
        r)�_SIZE_HINT_LIMIT�duplicated_int64)�get_group_index)r�cs2tj|tt����d�\}}|jddd�t|�fS)N)Z	size_hint�i8F)r�)rWZ	factorizer�r�r�)�valsr
r�)r?r�r�r��fhszDataFrame.duplicated.<locals>.fNc3s |]\}}|�kr|jVqdS)N)r�)r�r�rW)r6r�r�r��sz'DataFrame.duplicated.<locals>.<genexpr>F)�sortZxnull)r�)Zpandas._libs.hashtabler?r@�pandas.core.sortingrAr�r	r�r�r��iterabler�r�r�rrrf�
differencer`rrr�r
r�)r�r6r<r@rArD�diffrCr
r��idsr�)r?r�r6r�rs(]



zDataFrame.duplicated�	quicksort�lastc	szt|d�}�j���t|t�s$|g}t|�rZt|�t|�krZtdt|��dt|��d���t|�dkr�ddlm}	��fdd	�|D�}
|dk	r�d
d	�t	|
|�D�}
|	|
|||d�}t
|�}n`ddlm}|d}�j|�d
�}
|dk	r�t
|
|d�}
t|ttf��r
|d}||
||||d�}�jj|�j��dd�}|�rNtjt|��|jd<�j|�}|�rh�j|�S|j�dd�SdS)Nr�zLength of ascending (z) != length of by (�)r�r)�lexsort_indexercsg|]}�j|�d��qS))r�)�_get_label_or_level_values)r�rV)r�r�r�r�r��sz)DataFrame.sort_values.<locals>.<listcomp>cSsg|]\}}t||d��qS))r�)rz)r�r�r�r�r�r�r��s)�orders�na_positionr�)�nargsort)r�)r�)�kind�	ascendingrQr�F)r��verify�sort_values)r�)r0r2r�r�rQr�r�rFrNr
r@rRrOrzr�r�r$�_get_block_manager_axisr�r�r�r�r�r�)r��byr�rTr�rSrQr=r�rNr+r�rRr��new_datar$r�)r�r�r�rV�s@





zDataFrame.sort_values)rTr�rSrQ�sort_remainingr=r�c
Cs4t|d�}|j|�}|j|�}
t|
|	|d�}
|
j�}
|dk	rR|
j|||d�\}}nlt|
t�r|ddlm	}
|
|
j
�||d�}nBddlm}|r�|
js�|r�|
j
r�|r�dS|j�S||
|||d	�}|j|�}|jj||d
d�}|j|j�|j|<|�rtjt|��|jd<|j|�}|�r"|j|�S|j|d
d�SdS)uJ

        Sort object by labels (along an axis).

        Returns a new DataFrame sorted by label if `inplace` argument is
        ``False``, otherwise updates the original DataFrame and returns None.

        Parameters
        ----------
        axis : {0 or 'index', 1 or 'columns'}, default 0
            The axis along which to sort.  The value 0 identifies the rows,
            and 1 identifies the columns.
        level : int or level name or list of ints or list of level names
            If not None, sort on values in specified index level(s).
        ascending : bool or list of bools, default True
            Sort ascending vs. descending. When the index is a MultiIndex the
            sort direction can be controlled for each level individually.
        inplace : bool, default False
            If True, perform operation in-place.
        kind : {'quicksort', 'mergesort', 'heapsort'}, default 'quicksort'
            Choice of sorting algorithm. See also numpy.sort for more
            information.  `mergesort` is the only stable algorithm. For
            DataFrames, this option is only applied when sorting on a single
            column or label.
        na_position : {'first', 'last'}, default 'last'
            Puts NaNs at the beginning if `first`; `last` puts NaNs at the end.
            Not implemented for MultiIndex.
        sort_remaining : bool, default True
            If True and sorting by level and index is multilevel, sort by other
            levels too (in order) after sorting by specified level.
        ignore_index : bool, default False
            If True, the resulting axis will be labeled 0, 1, …, n - 1.

            .. versionadded:: 1.0.0

        key : callable, optional
            If not None, apply the key function to the index values
            before sorting. This is similar to the `key` argument in the
            builtin :meth:`sorted` function, with the notable difference that
            this `key` function should be *vectorized*. It should expect an
            ``Index`` and return an ``Index`` of the same shape. For MultiIndex
            inputs, the key is applied *per level*.

            .. versionadded:: 1.1.0

        Returns
        -------
        DataFrame
            The original DataFrame sorted by the labels.

        See Also
        --------
        Series.sort_index : Sort Series by the index.
        DataFrame.sort_values : Sort DataFrame by the value.
        Series.sort_values : Sort Series by the value.

        Examples
        --------
        >>> df = pd.DataFrame([1, 2, 3, 4, 5], index=[100, 29, 234, 1, 150],
        ...                   columns=['A'])
        >>> df.sort_index()
             A
        1    4
        29   2
        100  1
        150  5
        234  3

        By default, it sorts in ascending order. To sort in descending order,
        use ``ascending=False``.

        >>> df.sort_index(ascending=False)
             A
        234  3
        150  5
        100  1
        29   2
        1    4

        A key function can be specified which is applied to the index before
        sorting. For a ``MultiIndex`` this is applied to each level separately.

        >>> df = pd.DataFrame({"a": [1, 2, 3, 4]}, index=['A', 'b', 'C', 'd'])
        >>> df.sort_index(key=lambda x: x.str.lower())
           a
        A  1
        b  2
        C  3
        d  4
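
The difference between a vectorized ``key`` and the per-element ``key`` of the builtin ``sorted`` can be sketched in plain Python. The helper ``sort_index_sketch`` below is hypothetical and only models the ordering rule: the key receives all labels at once and must return one sort key per label.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def sort_index_sketch(
    labels: Sequence,
    values: Sequence,
    key: Optional[Callable[[Sequence], Sequence]] = None,
    ascending: bool = True,
) -> List[Tuple]:
    """Return (label, value) pairs ordered by label."""
    # The key is applied once to the whole label sequence (vectorized),
    # not once per label as with the builtin sorted().
    keys = list(key(labels)) if key is not None else list(labels)
    if len(keys) != len(labels):
        raise ValueError("key must return one entry per label")
    order = sorted(range(len(labels)), key=keys.__getitem__, reverse=not ascending)
    return [(labels[i], values[i]) for i in order]


numeric = sort_index_sketch([100, 29, 234, 1, 150], [1, 2, 3, 4, 5], ascending=False)
plain = sort_index_sketch(["A", "b", "C", "d"], [1, 2, 3, 4])
folded = sort_index_sketch(
    ["A", "b", "C", "d"], [1, 2, 3, 4], key=lambda idx: [s.lower() for s in idx]
)
```

Without the key, uppercase labels sort before lowercase ones; with the case-folding key the labels come out in alphabetical order, as in the last example above.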
        r�)r)N)rTrZr)rN)rPrQ)rR)rSrTrQF)r�rUr��
sort_index)r�)r0r2r3r{Z_sort_levels_monotonicZ	sortlevelr�rjrFrNZ_get_codes_for_sortingrRZis_monotonic_increasingZis_monotonic_decreasingr�rWr�r$r�r�r�r�r�r�r�)r�r�r�rTr�rSrQrZr=r�r
Znew_axisr�rNrRZbaxisrYr$r�r�r�r[�s>h








zDataFrame.sort_index)r6�	normalizerErTcCsl|dkr|jj�}|j|�jj�}|r2|j|d�}|rB||j�}t|�dkrhtj	|j
g|j
jgd�|_
|S)ag
        Return a Series containing counts of unique rows in the DataFrame.

        .. versionadded:: 1.1.0

        Parameters
        ----------
        subset : list-like, optional
            Columns to use when counting unique combinations.
        normalize : bool, default False
            Return proportions rather than frequencies.
        sort : bool, default True
            Sort by frequencies.
        ascending : bool, default False
            Sort in ascending order.

        Returns
        -------
        Series

        See Also
        --------
        Series.value_counts: Equivalent method on Series.

        Notes
        -----
        The returned Series will have a MultiIndex with one level per input
        column. By default, rows that contain any NA values are omitted from
        the result. By default, the resulting Series will be in descending
        order so that the first element is the most frequently-occurring row.

        Examples
        --------
        >>> df = pd.DataFrame({'num_legs': [2, 4, 4, 6],
        ...                    'num_wings': [2, 0, 0, 0]},
        ...                   index=['falcon', 'dog', 'cat', 'ant'])
        >>> df
                num_legs  num_wings
        falcon         2          2
        dog            4          0
        cat            4          0
        ant            6          0

        >>> df.value_counts()
        num_legs  num_wings
        4         0            2
        6         0            1
        2         2            1
        dtype: int64

        >>> df.value_counts(sort=False)
        num_legs  num_wings
        2         2            1
        4         0            2
        6         0            1
        dtype: int64

        >>> df.value_counts(ascending=True)
        num_legs  num_wings
        2         2            1
        6         0            1
        4         0            2
        dtype: int64

        >>> df.value_counts(normalize=True)
        num_legs  num_wings
        4         0            0.50
        6         0            0.25
        2         2            0.25
        dtype: float64
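
Counting unique rows amounts to tallying whole-row tuples. The following plain-Python sketch uses ``collections.Counter`` on the same data; it illustrates the idea only, and the order among rows with equal counts is not meant to match pandas.

```python
from collections import Counter

# One (num_legs, num_wings) tuple per row: falcon, dog, cat, ant.
rows = [(2, 2), (4, 0), (4, 0), (6, 0)]

counts = Counter(rows)

# Most frequent row first, as with the default sort=True, ascending=False.
by_frequency = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# normalize=True divides each count by the total number of rows.
total = sum(counts.values())
proportions = {row: n / total for row, n in counts.items()}
```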
        N)rTr�)r�)r�r:�groupbyZgrouperr�rV�sumr�rjZfrom_arraysr�r�)r�r6r\rErT�countsr�r�r��value_countsjsN
zDataFrame.value_countscCstj||||d�j�S)a�
        Return the first `n` rows ordered by `columns` in descending order.

        Return the first `n` rows with the largest values in `columns`, in
        descending order. The columns that are not specified are returned as
        well, but not used for ordering.

        This method is equivalent to
        ``df.sort_values(columns, ascending=False).head(n)``, but more
        performant.

        Parameters
        ----------
        n : int
            Number of rows to return.
        columns : label or list of labels
            Column label(s) to order by.
        keep : {'first', 'last', 'all'}, default 'first'
            Where there are duplicate values:

            - ``first`` : prioritize the first occurrence(s)
            - ``last`` : prioritize the last occurrence(s)
            - ``all`` : do not drop any duplicates, even if it means
                        selecting more than `n` items.

            .. versionadded:: 0.24.0

        Returns
        -------
        DataFrame
            The first `n` rows ordered by the given columns in descending
            order.

        See Also
        --------
        DataFrame.nsmallest : Return the first `n` rows ordered by `columns` in
            ascending order.
        DataFrame.sort_values : Sort DataFrame by the values.
        DataFrame.head : Return the first `n` rows without re-ordering.

        Notes
        -----
        This function cannot be used with all column types. For example, when
        specifying columns with `object` or `category` dtypes, ``TypeError`` is
        raised.

        Examples
        --------
        >>> df = pd.DataFrame({'population': [59000000, 65000000, 434000,
        ...                                   434000, 434000, 337000, 11300,
        ...                                   11300, 11300],
        ...                    'GDP': [1937894, 2583560 , 12011, 4520, 12128,
        ...                            17036, 182, 38, 311],
        ...                    'alpha-2': ["IT", "FR", "MT", "MV", "BN",
        ...                                "IS", "NR", "TV", "AI"]},
        ...                   index=["Italy", "France", "Malta",
        ...                          "Maldives", "Brunei", "Iceland",
        ...                          "Nauru", "Tuvalu", "Anguilla"])
        >>> df
                  population      GDP alpha-2
        Italy       59000000  1937894      IT
        France      65000000  2583560      FR
        Malta         434000    12011      MT
        Maldives      434000     4520      MV
        Brunei        434000    12128      BN
        Iceland       337000    17036      IS
        Nauru          11300      182      NR
        Tuvalu         11300       38      TV
        Anguilla       11300      311      AI

        In the following example, we will use ``nlargest`` to select the three
        rows having the largest values in column "population".

        >>> df.nlargest(3, 'population')
                population      GDP alpha-2
        France    65000000  2583560      FR
        Italy     59000000  1937894      IT
        Malta       434000    12011      MT

        When using ``keep='last'``, ties are resolved in reverse order:

        >>> df.nlargest(3, 'population', keep='last')
                population      GDP alpha-2
        France    65000000  2583560      FR
        Italy     59000000  1937894      IT
        Brunei      434000    12128      BN

        When using ``keep='all'``, all duplicate items are maintained:

        >>> df.nlargest(3, 'population', keep='all')
                  population      GDP alpha-2
        France      65000000  2583560      FR
        Italy       59000000  1937894      IT
        Malta         434000    12011      MT
        Maldives      434000     4520      MV
        Brunei        434000    12128      BN

        To order by the largest values in column "population" and then "GDP",
        we can specify multiple columns like in the next example.

        >>> df.nlargest(3, ['population', 'GDP'])
                population      GDP alpha-2
        France    65000000  2583560      FR
        Italy     59000000  1937894      IT
        Brunei      434000    12128      BN
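
The tie-handling rules for ``keep`` can be sketched in plain Python for the single-column case. The helper ``nlargest_sketch`` is hypothetical and uses the sort-then-head formulation mentioned above rather than the faster selection that pandas performs.

```python
from typing import Dict, List


def nlargest_sketch(n: int, data: Dict[str, float], keep: str = "first") -> List[str]:
    """Return the labels of the n largest values, honouring ``keep`` for ties."""
    if keep not in ("first", "last", "all"):
        raise ValueError("keep must be 'first', 'last' or 'all'")
    items = list(data.items())
    if keep == "last":
        # Reversing first makes later rows win ties in the stable sort below.
        items.reverse()
    ranked = sorted(items, key=lambda item: item[1], reverse=True)
    if n <= 0:
        return []
    if keep == "all" and n < len(ranked):
        # Keep every row tied with the n-th value, even beyond n rows.
        cutoff = ranked[n - 1][1]
        return [label for label, value in ranked if value >= cutoff]
    return [label for label, _ in ranked[:n]]


population = {
    "Italy": 59000000, "France": 65000000, "Malta": 434000,
    "Maldives": 434000, "Brunei": 434000, "Iceland": 337000,
    "Nauru": 11300, "Tuvalu": 11300, "Anguilla": 11300,
}
top_first = nlargest_sketch(3, population)
top_last = nlargest_sketch(3, population, keep="last")
top_all = nlargest_sketch(3, population, keep="all")
```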
        )rmr<r�)rW�SelectNFrame�nlargest)r�rmr�r<r�r�r�rb�skzDataFrame.nlargestcCstj||||d�j�S)a
        Return the first `n` rows ordered by `columns` in ascending order.

        Return the first `n` rows with the smallest values in `columns`, in
        ascending order. The columns that are not specified are returned as
        well, but not used for ordering.

        This method is equivalent to
        ``df.sort_values(columns, ascending=True).head(n)``, but more
        performant.

        Parameters
        ----------
        n : int
            Number of items to retrieve.
        columns : list or str
            Column name or names to order by.
        keep : {'first', 'last', 'all'}, default 'first'
            Where there are duplicate values:

            - ``first`` : take the first occurrence.
            - ``last`` : take the last occurrence.
            - ``all`` : do not drop any duplicates, even if it means
              selecting more than `n` items.

            .. versionadded:: 0.24.0

        Returns
        -------
        DataFrame

        See Also
        --------
        DataFrame.nlargest : Return the first `n` rows ordered by `columns` in
            descending order.
        DataFrame.sort_values : Sort DataFrame by the values.
        DataFrame.head : Return the first `n` rows without re-ordering.

        Examples
        --------
        >>> df = pd.DataFrame({'population': [59000000, 65000000, 434000,
        ...                                   434000, 434000, 337000, 337000,
        ...                                   11300, 11300],
        ...                    'GDP': [1937894, 2583560 , 12011, 4520, 12128,
        ...                            17036, 182, 38, 311],
        ...                    'alpha-2': ["IT", "FR", "MT", "MV", "BN",
        ...                                "IS", "NR", "TV", "AI"]},
        ...                   index=["Italy", "France", "Malta",
        ...                          "Maldives", "Brunei", "Iceland",
        ...                          "Nauru", "Tuvalu", "Anguilla"])
        >>> df
                  population      GDP alpha-2
        Italy       59000000  1937894      IT
        France      65000000  2583560      FR
        Malta         434000    12011      MT
        Maldives      434000     4520      MV
        Brunei        434000    12128      BN
        Iceland       337000    17036      IS
        Nauru         337000      182      NR
        Tuvalu         11300       38      TV
        Anguilla       11300      311      AI

        In the following example, we will use ``nsmallest`` to select the
        three rows having the smallest values in column "population".

        >>> df.nsmallest(3, 'population')
                  population    GDP alpha-2
        Tuvalu         11300     38      TV
        Anguilla       11300    311      AI
        Iceland       337000  17036      IS

        When using ``keep='last'``, ties are resolved in reverse order:

        >>> df.nsmallest(3, 'population', keep='last')
                  population  GDP alpha-2
        Anguilla       11300  311      AI
        Tuvalu         11300   38      TV
        Nauru         337000  182      NR

        When using ``keep='all'``, all duplicate items are maintained:

        >>> df.nsmallest(3, 'population', keep='all')
                  population    GDP alpha-2
        Tuvalu         11300     38      TV
        Anguilla       11300    311      AI
        Iceland       337000  17036      IS
        Nauru         337000    182      NR

        To order by the smallest values in column "population" and then "GDP", we can
        specify multiple columns like in the next example.

        >>> df.nsmallest(3, ['population', 'GDP'])
                  population  GDP alpha-2
        Tuvalu         11300   38      TV
        Anguilla       11300  311      AI
        Nauru         337000  182      NR
        )rmr<r�)rWra�	nsmallest)r�rmr�r<r�r�r�rc7sbzDataFrame.nsmallestr�cCsx|j�}|j|�}t|j|�t�s*td��|dkrTt|jt�sBt�|jj||�|_n t|j	t�sdt�|j	j||�|_	|S)a�
        Swap levels i and j in a MultiIndex on a particular axis.

        Parameters
        ----------
        i, j : int or str
            Levels of the indices to be swapped. Can pass level name as string.
        axis : {0 or 'index', 1 or 'columns'}, default 0
            The axis to swap levels on. 0 or 'index' for row-wise, 1 or
            'columns' for column-wise.

        Returns
        -------
        DataFrame
        z,Can only swap levels on a hierarchical axis.r)
r�r2r�r3rjr�r�r��	swaplevelr�)r�r�jr�r$r�r�r�rd�s
zDataFrame.swaplevelcCst|j|�}t|j|�t�s"td��|j�}|dkrRt|jt�sBt�|jj|�|_nt|j	t�sbt�|j	j|�|_	|S)a�
        Rearrange index levels using input order. May not drop or duplicate levels.

        Parameters
        ----------
        order : list of int or list of str
            List representing new level order. Reference level by number
            (position) or by key (label).
        axis : {0 or 'index', 1 or 'columns'}, default 0
            Where to reorder levels.

        Returns
        -------
        DataFrame
        z/Can only reorder levels on a hierarchical axis.r)
r2r�r3rjr�r�r�r��reorder_levelsr�)r��orderr�r$r�r�r�rf�s
zDataFrame.reorder_levels)r!cs.�dkr�}n��fdd�}tj|||�}|S)Ncstj||��\}}�||�S)N)rZZ
fill_binop)r"r#)r�r/r�r��	_arith_op�sz+DataFrame._combine_frame.<locals>._arith_op)rZZdispatch_to_series)r�r!r/r�rhrYr�)r�r/r��_combine_frame�s
zDataFrame._combine_framecCs"|j|dd�}|j|_|j|_|S)z�
        Wrap the result of an arithmetic, comparison, or logical operation.

        Parameters
        ----------
        result : DataFrame

        Returns
        -------
        DataFrame
        F)r�)r�r�r�)r�r$�outr�r�r��_construct_result�szDataFrame._construct_resulta
Returns
-------
DataFrame
    DataFrame that shows the differences stacked side by side.

    The resulting index will be a MultiIndex with 'self' and 'other'
    stacked alternately at the inner level.

See Also
--------
Series.compare : Compare with another Series and show differences.

Notes
-----
Matching NaNs will not appear as a difference.

Examples
--------
>>> df = pd.DataFrame(
...     {
...         "col1": ["a", "a", "b", "b", "a"],
...         "col2": [1.0, 2.0, 3.0, np.nan, 5.0],
...         "col3": [1.0, 2.0, 3.0, 4.0, 5.0]
...     },
...     columns=["col1", "col2", "col3"],
... )
>>> df
  col1  col2  col3
0    a   1.0   1.0
1    a   2.0   2.0
2    b   3.0   3.0
3    b   NaN   4.0
4    a   5.0   5.0

>>> df2 = df.copy()
>>> df2.loc[0, 'col1'] = 'c'
>>> df2.loc[2, 'col3'] = 4.0
>>> df2
  col1  col2  col3
0    c   1.0   1.0
1    a   2.0   2.0
2    b   3.0   4.0
3    b   NaN   4.0
4    a   5.0   5.0

Align the differences on columns

>>> df.compare(df2)
  col1       col3
  self other self other
0    a     c  NaN   NaN
2  NaN   NaN  3.0   4.0

Stack the differences on rows

>>> df.compare(df2, align_axis=0)
        col1  col3
0 self     a   NaN
  other    c   NaN
2 self   NaN   3.0
  other  NaN   4.0

Keep the equal values

>>> df.compare(df2, keep_equal=True)
  col1       col3
  self other self other
0    a     c  1.0   1.0
2    b     b  3.0   4.0

Keep all original rows and columns

>>> df.compare(df2, keep_shape=True)
  col1       col2       col3
  self other self other self other
0    a     c  NaN   NaN  NaN   NaN
1  NaN   NaN  NaN   NaN  NaN   NaN
2  NaN   NaN  NaN   NaN  3.0   4.0
3  NaN   NaN  NaN   NaN  NaN   NaN
4  NaN   NaN  NaN   NaN  NaN   NaN

Keep all original rows and columns and also all original values

>>> df.compare(df2, keep_shape=True, keep_equal=True)
  col1       col2       col3
  self other self other self other
0    a     c  1.0   1.0  1.0   1.0
1    a     a  2.0   2.0  2.0   2.0
2    b     b  3.0   3.0  3.0   4.0
3    b     b  NaN   NaN  4.0   4.0
4    a     a  5.0   5.0  5.0   5.0
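
The default behaviour (report only the cells that differ, and treat matching NaNs as equal) can be sketched in plain Python. The helper names and the dict-of-lists representation are assumptions made for this illustration; the sketch does not reproduce the MultiIndex layout of the real result.

```python
import math
from typing import Any, Dict, List, Tuple


def _same(a: Any, b: Any) -> bool:
    """Equality in which two NaNs count as matching."""
    both_nan = (
        isinstance(a, float) and isinstance(b, float)
        and math.isnan(a) and math.isnan(b)
    )
    return both_nan or a == b


def compare_sketch(
    left: Dict[str, List[Any]], right: Dict[str, List[Any]]
) -> Dict[int, Dict[str, Tuple[Any, Any]]]:
    """Return {row: {column: (self, other)}} for the cells that differ."""
    diffs: Dict[int, Dict[str, Tuple[Any, Any]]] = {}
    for col in left:
        for row, (a, b) in enumerate(zip(left[col], right[col])):
            if not _same(a, b):
                diffs.setdefault(row, {})[col] = (a, b)
    return diffs


df = {
    "col1": ["a", "a", "b", "b", "a"],
    "col2": [1.0, 2.0, 3.0, float("nan"), 5.0],
    "col3": [1.0, 2.0, 3.0, 4.0, 5.0],
}
df2 = {col: list(vals) for col, vals in df.items()}
df2["col1"][0] = "c"
df2["col3"][2] = 4.0

diffs = compare_sketch(df, df2)
```

Only rows 0 and 2 appear in ``diffs``; the NaN in ``col2`` is present in both inputs and is therefore not reported.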
�compare)r!�
align_axis�
keep_shape�
keep_equalr�cst�j||||d�S)N)r!rmrnro)rrl)r�r!rmrnro)rr�r�rls
gzDataFrame.compare)r!r�cCs�t|j�}|j|dd�\}}|j}|jrBt|�t|j�krB|j�S|jr\t|�|kr\|j�S|jj|j�}|dk	}	i}
�x|D�]}||}||}
|j}|
j}t|�}t|
�}|r�|j	�r�||j�|
|<q~|	r�|j�}|
j�}
|||<||
|<||jk�r0|}y|j
|dd�}Wntk
�r,YnXn8t||g�}t
||��sR|j
|�}t
||��sh|
j
|�}
|||
�}t||�}||
|<q~W|j|
||d�S)a�

        Perform column-wise combine with another DataFrame.

        Combines a DataFrame with `other` DataFrame using `func`
        to element-wise combine columns. The row and column indexes of the
        resulting DataFrame will be the union of the two.

        Parameters
        ----------
        other : DataFrame
            The DataFrame to merge column-wise.
        func : function
            Function that takes two Series as inputs and returns a Series or a
            scalar. Used to merge the two dataframes column by column.
        fill_value : scalar value, default None
            The value to fill NaNs with prior to passing any column to the
            merge func.
        overwrite : bool, default True
            If True, columns in `self` that do not exist in `other` will be
            overwritten with NaNs.

        Returns
        -------
        DataFrame
            Combination of the provided DataFrames.

        See Also
        --------
        DataFrame.combine_first : Combine two DataFrame objects and default to
            non-null values in frame calling the method.

        Examples
        --------
        Combine using a simple function that chooses the smaller column.

        >>> df1 = pd.DataFrame({'A': [0, 0], 'B': [4, 4]})
        >>> df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]})
        >>> take_smaller = lambda s1, s2: s1 if s1.sum() < s2.sum() else s2
        >>> df1.combine(df2, take_smaller)
           A  B
        0  0  3
        1  0  3

        Example using a true element-wise combine function.

        >>> df1 = pd.DataFrame({'A': [5, 0], 'B': [2, 4]})
        >>> df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]})
        >>> df1.combine(df2, np.minimum)
           A  B
        0  1  2
        1  0  3

        Using `fill_value` fills Nones prior to passing the column to the
        merge function.

        >>> df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]})
        >>> df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]})
        >>> df1.combine(df2, take_smaller, fill_value=-5)
           A    B
        0  0 -5.0
        1  0  4.0

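        If the same element is None in both dataframes, it is filled with
        `fill_value` in both before the merge function is called, so the
        result contains the fill value at that position.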

        >>> df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]})
        >>> df2 = pd.DataFrame({'A': [1, 1], 'B': [None, 3]})
        >>> df1.combine(df2, take_smaller, fill_value=-5)
           A    B
        0  0 -5.0
        1  0  3.0

        Example that demonstrates the use of `overwrite` and the behavior when
        the axes differ between the dataframes.

        >>> df1 = pd.DataFrame({'A': [0, 0], 'B': [4, 4]})
        >>> df2 = pd.DataFrame({'B': [3, 3], 'C': [-10, 1], }, index=[1, 2])
        >>> df1.combine(df2, take_smaller)
             A    B     C
        0  NaN  NaN   NaN
        1  NaN  3.0 -10.0
        2  NaN  3.0   1.0

        >>> df1.combine(df2, take_smaller, overwrite=False)
             A    B     C
        0  0.0  NaN   NaN
        1  0.0  3.0 -10.0
        2  NaN  3.0   1.0

        Demonstrating the preference of the passed in dataframe.

        >>> df2 = pd.DataFrame({'B': [3, 3], 'C': [1, 1], }, index=[1, 2])
        >>> df2.combine(df1, take_smaller)
             A    B   C
        0  0.0  NaN NaN
        1  0.0  3.0 NaN
        2  NaN  3.0 NaN

        >>> df2.combine(df1, take_smaller, overwrite=False)
             A    B   C
        0  0.0  NaN NaN
        1  0.0  3.0 1.0
        2  NaN  3.0 1.0
        F)r�N)r�r�)r�r�rr�r�r�rr�rTr#r�r�r5rFr:r�)r�r!r/r��	overwriteZother_idxlen�thisrr�Zdo_fillr$rWr5ZotherSeriesZ
this_dtypeZother_dtypeZ	this_maskZ
other_maskZ	new_dtyper�r�r�r��combinersPk




zDataFrame.combinecs6ddljjj�dd�����fdd�}�j||dd�S)	a�
        Update null elements with value in the same location in `other`.

        Combine two DataFrame objects by filling null values in one DataFrame
        with non-null values from other DataFrame. The row and column indexes
        of the resulting DataFrame will be the union of the two.

        Parameters
        ----------
        other : DataFrame
            Provided DataFrame to use to fill null values.

        Returns
        -------
        DataFrame

        See Also
        --------
        DataFrame.combine : Perform series-wise operation on two DataFrames
            using a given function.

        Examples
        --------
        >>> df1 = pd.DataFrame({'A': [None, 0], 'B': [None, 4]})
        >>> df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]})
        >>> df1.combine_first(df2)
             A    B
        0  1.0  3.0
        1  0.0  4.0

        Null values still persist if the location of that null value
        does not exist in `other`

        >>> df1 = pd.DataFrame({'A': [None, 0], 'B': [4, None]})
        >>> df2 = pd.DataFrame({'B': [3, 3], 'C': [1, 1]}, index=[1, 2])
        >>> df1.combine_first(df2)
             A    B    C
        0  NaN  4.0  NaN
        1  0.0  3.0  1.0
        2  NaN  3.0  1.0
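
The fill rule can be sketched in plain Python with frames modelled as dicts keyed by ``(row, column)`` and ``None`` standing in for a null. The helper ``combine_first_sketch`` is hypothetical and ignores dtype handling.

```python
from typing import Any, Dict, Tuple

Cell = Tuple[int, str]


def combine_first_sketch(this: Dict[Cell, Any], other: Dict[Cell, Any]) -> Dict[Cell, Any]:
    """Fill nulls in ``this`` from ``other`` over the union of rows and columns."""
    rows = sorted({r for r, _ in this} | {r for r, _ in other})
    cols = sorted({c for _, c in this} | {c for _, c in other})
    out: Dict[Cell, Any] = {}
    for r in rows:
        for c in cols:
            value = this.get((r, c))
            # Fall back to ``other`` only where ``this`` has no value.
            out[(r, c)] = value if value is not None else other.get((r, c))
    return out


# Same data as the second example above.
df1 = {(0, "A"): None, (1, "A"): 0, (0, "B"): 4, (1, "B"): None}
df2 = {(1, "B"): 3, (2, "B"): 3, (1, "C"): 1, (2, "C"): 1}
combined = combine_first_sketch(df1, df2)
```

Cells that are null in ``this`` and absent from ``other``, such as row 0 of column ``A``, stay null in the result.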
        rNcSs>t|ttf�r|j}t|j�r:t|j�r0|j}n
|jd�}|S)NrB)	r�rfrzrrRr�rGZasi8�view)r�r�r�r��extract_valuesGs


z/DataFrame.combine_first.<locals>.extract_valuescsJt|�}t|ttf�r|j}�|�}�|�}|j�jkr<|S�j|||�S)N)rTr�rfrzrr�r�r�)rV�yr�Zx_valuesZy_values)�expressionsrtr�r�r��combinerVsz)DataFrame.combine_first.<locals>.combinerF)rp)�#pandas.core.computation.expressions�core�computationrvrr)r�r!rwr�)rvrtr�r��
combine_firsts*zDataFrame.combine_firstr"c
Csddljjj}|dkrtd��|d
kr.td��t|t�s@t|�}|j|�}x�|j	D]�}||j
}||j
}	|dk	r�tjdd��||�t
|	�B}
WdQRXnB|dkr�t|	�}t|�}t||@�r�td	��|r�t
|	�}
nt|�}
|
j�r�qR|j|
||	�||<qRWdS)a�
        Modify in place using non-NA values from another DataFrame.

        Aligns on indices. There is no return value.

        Parameters
        ----------
        other : DataFrame, or object coercible into a DataFrame
            Should have at least one matching index/column label
            with the original DataFrame. If a Series is passed,
            its name attribute must be set, and that will be
            used as the column name to align with the original DataFrame.
        join : {'left'}, default 'left'
            Only left join is implemented, keeping the index and columns of the
            original object.
        overwrite : bool, default True
            How to handle non-NA values for overlapping keys:

            * True: overwrite original DataFrame's values
              with values from `other`.
            * False: only update values that are NA in
              the original DataFrame.

        filter_func : callable(1d-array) -> bool 1d-array, optional
            Can choose to replace values other than NA. Return True for values
            that should be updated.
        errors : {'raise', 'ignore'}, default 'ignore'
            If 'raise', will raise a ValueError if the DataFrame and `other`
            both contain non-NA data in the same place.

            .. versionchanged:: 0.24.0
               Changed from `raise_conflict=False|True`
               to `errors='ignore'|'raise'`.

        Returns
        -------
        None : method directly changes calling object

        Raises
        ------
        ValueError
            * When `errors='raise'` and there's overlapping non-NA data.
            * When `errors` is not either `'ignore'` or `'raise'`
        NotImplementedError
            * If `join != 'left'`

        See Also
        --------
        dict.update : Similar method for dictionaries.
        DataFrame.merge : For column(s)-on-column(s) operations.

        Examples
        --------
        >>> df = pd.DataFrame({'A': [1, 2, 3],
        ...                    'B': [400, 500, 600]})
        >>> new_df = pd.DataFrame({'B': [4, 5, 6],
        ...                        'C': [7, 8, 9]})
        >>> df.update(new_df)
        >>> df
           A  B
        0  1  4
        1  2  5
        2  3  6

        The DataFrame's length does not increase as a result of the update,
        only values at matching index/column labels are updated.

        >>> df = pd.DataFrame({'A': ['a', 'b', 'c'],
        ...                    'B': ['x', 'y', 'z']})
        >>> new_df = pd.DataFrame({'B': ['d', 'e', 'f', 'g', 'h', 'i']})
        >>> df.update(new_df)
        >>> df
           A  B
        0  a  d
        1  b  e
        2  c  f

        For Series, its name attribute must be set.

        >>> df = pd.DataFrame({'A': ['a', 'b', 'c'],
        ...                    'B': ['x', 'y', 'z']})
        >>> new_column = pd.Series(['d', 'e'], name='B', index=[0, 2])
        >>> df.update(new_column)
        >>> df
           A  B
        0  a  d
        1  b  y
        2  c  e
        >>> df = pd.DataFrame({'A': ['a', 'b', 'c'],
        ...                    'B': ['x', 'y', 'z']})
        >>> new_df = pd.DataFrame({'B': ['d', 'e']}, index=[1, 2])
        >>> df.update(new_df)
        >>> df
           A  B
        0  a  x
        1  b  d
        2  c  e

        If `other` contains NaNs the corresponding values are not updated
        in the original dataframe.

        >>> df = pd.DataFrame({'A': [1, 2, 3],
        ...                    'B': [400, 500, 600]})
        >>> new_df = pd.DataFrame({'B': [4, np.nan, 6]})
        >>> df.update(new_df)
        >>> df
           A      B
        0  1    4.0
        1  2  500.0
        2  3    6.0
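
The in-place update rule can be sketched in plain Python, again with frames modelled as dicts keyed by ``(row, column)`` and ``None`` standing in for NA. The helper ``update_sketch`` is hypothetical; it covers only the left-join and ``overwrite`` behaviour, not ``filter_func`` or ``errors``.

```python
from typing import Any, Dict, Tuple

Cell = Tuple[int, str]


def update_sketch(target: Dict[Cell, Any], other: Dict[Cell, Any], overwrite: bool = True) -> None:
    """Modify ``target`` in place using the non-null values of ``other``."""
    for cell, new_value in other.items():
        # Left join: cells that do not exist in ``target`` are ignored,
        # and nulls in ``other`` never replace existing data.
        if cell not in target or new_value is None:
            continue
        if overwrite or target[cell] is None:
            target[cell] = new_value


df = {
    (0, "A"): 1, (1, "A"): 2, (2, "A"): 3,
    (0, "B"): 400, (1, "B"): 500, (2, "B"): 600,
}
new_df = {(0, "B"): 4, (1, "B"): None, (2, "B"): 6, (0, "C"): 7}
returned = update_sketch(df, new_df)

# With overwrite=False only cells that are null in the target are filled.
partial = {(0, "A"): None, (1, "A"): 2}
update_sketch(partial, {(0, "A"): 9, (1, "A"): 9}, overwrite=False)
```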
        rNr"zOnly left join is supportedrrz7The parameter errors must be either 'ignore' or 'raise')r#zData overlaps.)rr)rxryrzrvr�r�r�r��reindex_liker�rr��errstaterTrVr�r#r�)
r�r!rrpZfilter_funcrrvrWrq�thatr�Z	mask_thisZ	mask_thatr�r�r�rahs2r




zDataFrame.updatea
Examples
--------
>>> df = pd.DataFrame({'Animal': ['Falcon', 'Falcon',
...                               'Parrot', 'Parrot'],
...                    'Max Speed': [380., 370., 24., 26.]})
>>> df
   Animal  Max Speed
0  Falcon      380.0
1  Falcon      370.0
2  Parrot       24.0
3  Parrot       26.0
>>> df.groupby(['Animal']).mean()
        Max Speed
Animal
Falcon      375.0
Parrot       25.0

**Hierarchical Indexes**

We can group by different levels of a hierarchical index
using the `level` parameter:

>>> arrays = [['Falcon', 'Falcon', 'Parrot', 'Parrot'],
...           ['Captive', 'Wild', 'Captive', 'Wild']]
>>> index = pd.MultiIndex.from_arrays(arrays, names=('Animal', 'Type'))
>>> df = pd.DataFrame({'Max Speed': [390., 350., 30., 20.]},
...                   index=index)
>>> df
                Max Speed
Animal Type
Falcon Captive      390.0
       Wild         350.0
Parrot Captive       30.0
       Wild          20.0
>>> df.groupby(level=0).mean()
        Max Speed
Animal
Falcon      370.0
Parrot       25.0
>>> df.groupby(level="Type").mean()
         Max Speed
Type
Captive      210.0
Wild         185.0

We can also choose whether to include NA in group keys by setting the
`dropna` parameter; the default is `True`:

>>> l = [[1, 2, 3], [1, None, 4], [2, 1, 3], [1, 2, 2]]
>>> df = pd.DataFrame(l, columns=["a", "b", "c"])

>>> df.groupby(by=["b"]).sum()
    a   c
b
1.0 2   3
2.0 2   5

>>> df.groupby(by=["b"], dropna=False).sum()
    a   c
b
1.0 2   3
2.0 2   5
NaN 1   4

>>> l = [["a", 12, 12], [None, 12.3, 33.], ["b", 12.3, 123], ["a", 1, 1]]
>>> df = pd.DataFrame(l, columns=["a", "b", "c"])

>>> df.groupby(by="a").sum()
    b     c
a
a   13.0   13.0
b   12.3  123.0

>>> df.groupby(by="a", dropna=False).sum()
    b     c
a
a   13.0   13.0
b   12.3  123.0
NaN 12.3   33.0
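
The `as_index` keyword in the `groupby` signature controls whether the group labels end up in the index or stay as a column. A minimal sketch, reusing the first example's data (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"Animal": ["Falcon", "Falcon", "Parrot", "Parrot"],
                   "Max Speed": [380.0, 370.0, 24.0, 26.0]})

# Default (as_index=True): the group labels become the index of the result.
indexed = df.groupby("Animal")["Max Speed"].mean()

# as_index=False: the group labels stay as a regular column.
flat = df.groupby("Animal", as_index=False)["Max Speed"].mean()

print(indexed)
print(flat)
```

The flat form is often easier to merge or export, since no `reset_index` call is needed afterwards.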
r]r�)�as_indexrE�
group_keys�squeeze�observedr:r�c
Cshddlm}
|tk	r&tjdtdd�nd}|dkrB|dkrBtd��|j|�}|
||||||||||	d�
S)	Nr)r�zNThe `squeeze` parameter is deprecated and will be removed in a future version.r�)r3Fz*You have to supply one of 'by' and 'level')
�objr+r�r�rrEr�r�r�r:)�pandas.core.groupby.genericr�rr@rArDr�r2)r�rXr�r�rrEr�r�r�r:r�r�r�r�r]s*`

zDataFrame.groupbya�
        Return reshaped DataFrame organized by given index / column values.

        Reshape data (produce a "pivot" table) based on column values. Uses
        unique values from specified `index` / `columns` to form axes of the
        resulting DataFrame. This function does not support data
        aggregation; multiple values will result in a MultiIndex in the
        columns. See the :ref:`User Guide <reshaping>` for more on reshaping.

        Parameters
        ----------%s
        index : str or object or a list of str, optional
            Column to use to make new frame's index. If None, uses
            existing index.

            .. versionchanged:: 1.1.0
               Also accept list of index names.

        columns : str or object or a list of str
            Column to use to make new frame's columns.

            .. versionchanged:: 1.1.0
               Also accept list of column names.

        values : str, object or a list of the previous, optional
            Column(s) to use for populating new frame's values. If not
            specified, all remaining columns will be used and the result will
            have hierarchically indexed columns.

            .. versionchanged:: 0.23.0
               Also accept list of column names.

        Returns
        -------
        DataFrame
            Returns reshaped DataFrame.

        Raises
        ------
        ValueError:
            When there are any `index`, `columns` combinations with multiple
            values. Use `DataFrame.pivot_table` when you need to aggregate.

        See Also
        --------
        DataFrame.pivot_table : Generalization of pivot that can handle
            duplicate values for one index/column pair.
        DataFrame.unstack : Pivot based on the index values instead of a
            column.

        Notes
        -----
        For finer-tuned control, see hierarchical indexing documentation along
        with the related stack/unstack methods.

        Examples
        --------
        >>> df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two',
        ...                            'two'],
        ...                    'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
        ...                    'baz': [1, 2, 3, 4, 5, 6],
        ...                    'zoo': ['x', 'y', 'z', 'q', 'w', 't']})
        >>> df
            foo   bar  baz  zoo
        0   one   A    1    x
        1   one   B    2    y
        2   one   C    3    z
        3   two   A    4    q
        4   two   B    5    w
        5   two   C    6    t

        >>> df.pivot(index='foo', columns='bar', values='baz')
        bar  A   B   C
        foo
        one  1   2   3
        two  4   5   6

        >>> df.pivot(index='foo', columns='bar')['baz']
        bar  A   B   C
        foo
        one  1   2   3
        two  4   5   6

        >>> df.pivot(index='foo', columns='bar', values=['baz', 'zoo'])
              baz       zoo
        bar   A  B  C   A  B  C
        foo
        one   1  2  3   x  y  z
        two   4  5  6   q  w  t

        You can also pass a list of column names or a list of index names.

        >>> df = pd.DataFrame({
        ...        "lev1": [1, 1, 1, 2, 2, 2],
        ...        "lev2": [1, 1, 2, 1, 1, 2],
        ...        "lev3": [1, 2, 1, 2, 1, 2],
        ...        "lev4": [1, 2, 3, 4, 5, 6],
        ...        "values": [0, 1, 2, 3, 4, 5]})
        >>> df
            lev1 lev2 lev3 lev4 values
        0   1    1    1    1    0
        1   1    1    2    2    1
        2   1    2    1    3    2
        3   2    1    2    4    3
        4   2    1    1    5    4
        5   2    2    2    6    5

        >>> df.pivot(index="lev1", columns=["lev2", "lev3"],values="values")
        lev2    1         2
        lev3    1    2    1    2
        lev1
        1     0.0  1.0  2.0  NaN
        2     4.0  3.0  NaN  5.0

        >>> df.pivot(index=["lev1", "lev2"], columns=["lev3"],values="values")
              lev3    1    2
        lev1  lev2
           1     1  0.0  1.0
                 2  2.0  NaN
           2     1  4.0  3.0
                 2  NaN  5.0

        A ValueError is raised if there are any duplicates.

        >>> df = pd.DataFrame({"foo": ['one', 'one', 'two', 'two'],
        ...                    "bar": ['A', 'A', 'B', 'C'],
        ...                    "baz": [1, 2, 3, 4]})
        >>> df
           foo bar  baz
        0  one   A    1
        1  one   A    2
        2  two   B    3
        3  two   C    4

        Notice that the first two rows are the same for our `index`
        and `columns` arguments.

        >>> df.pivot(index='foo', columns='bar', values='baz')
        Traceback (most recent call last):
           ...
        ValueError: Index contains duplicate entries, cannot reshape
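
When duplicates like these are expected, the Raises section points to `DataFrame.pivot_table`. The following sketch shows that fallback on the same data; summing is an arbitrary choice of aggregation made here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"foo": ["one", "one", "two", "two"],
                   "bar": ["A", "A", "B", "C"],
                   "baz": [1, 2, 3, 4]})

try:
    df.pivot(index="foo", columns="bar", values="baz")
    pivot_failed = False
except ValueError:
    # The pair ("one", "A") occurs twice, so pivot cannot reshape.
    pivot_failed = True

# pivot_table resolves the duplicates by aggregating them (here: summing).
table = df.pivot_table(index="foo", columns="bar", values="baz",
                       aggfunc="sum")
print(table)
```

Combinations that never occur in the data, such as `("one", "B")`, remain missing in the result.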
        �pivotcCsddlm}|||||d�S)Nr)r�)r�r�r�)�pandas.core.reshape.pivotr�)r�r�r�r�r�r�r�r�r�szDataFrame.pivota�
        Create a spreadsheet-style pivot table as a DataFrame.

        The levels in the pivot table will be stored in MultiIndex objects
        (hierarchical indexes) on the index and columns of the result DataFrame.

        Parameters
        ----------%s
        values : column to aggregate, optional
        index : column, Grouper, array, or list of the previous
            If an array is passed, it must be the same length as the data. The
            list can contain any of the other types (except list).
            Keys to group by on the pivot table index.  If an array is passed,
            it is used in the same manner as column values.
        columns : column, Grouper, array, or list of the previous
            If an array is passed, it must be the same length as the data. The
            list can contain any of the other types (except list).
            Keys to group by on the pivot table column.  If an array is passed,
            it is used in the same manner as column values.
        aggfunc : function, list of functions, dict, default numpy.mean
            If a list of functions is passed, the resulting pivot table will have
            hierarchical columns whose top level are the function names
            (inferred from the function objects themselves).
            If a dict is passed, the key is the column to aggregate and the value
            is a function or list of functions.
        fill_value : scalar, default None
            Value to replace missing values with (in the resulting pivot table,
            after aggregation).
        margins : bool, default False
            Add all row / columns (e.g. for subtotal / grand totals).
        dropna : bool, default True
            Do not include columns whose entries are all NaN.
        margins_name : str, default 'All'
            Name of the row / column that will contain the totals
            when margins is True.
        observed : bool, default False
            This only applies if any of the groupers are Categoricals.
            If True: only show observed values for categorical groupers.
            If False: show all values for categorical groupers.

            .. versionchanged:: 0.25.0

        Returns
        -------
        DataFrame
            An Excel style pivot table.

        See Also
        --------
        DataFrame.pivot : Pivot without aggregation that can handle
            non-numeric data.

        Examples
        --------
        >>> df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo",
        ...                          "bar", "bar", "bar", "bar"],
        ...                    "B": ["one", "one", "one", "two", "two",
        ...                          "one", "one", "two", "two"],
        ...                    "C": ["small", "large", "large", "small",
        ...                          "small", "large", "small", "small",
        ...                          "large"],
        ...                    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
        ...                    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]})
        >>> df
             A    B      C  D  E
        0  foo  one  small  1  2
        1  foo  one  large  2  4
        2  foo  one  large  2  5
        3  foo  two  small  3  5
        4  foo  two  small  3  6
        5  bar  one  large  4  6
        6  bar  one  small  5  8
        7  bar  two  small  6  9
        8  bar  two  large  7  9

        This first example aggregates values by taking the sum.

        >>> table = pd.pivot_table(df, values='D', index=['A', 'B'],
        ...                     columns=['C'], aggfunc=np.sum)
        >>> table
        C        large  small
        A   B
        bar one    4.0    5.0
            two    7.0    6.0
        foo one    4.0    1.0
            two    NaN    6.0

        We can also fill missing values using the `fill_value` parameter.

        >>> table = pd.pivot_table(df, values='D', index=['A', 'B'],
        ...                     columns=['C'], aggfunc=np.sum, fill_value=0)
        >>> table
        C        large  small
        A   B
        bar one      4      5
            two      7      6
        foo one      4      1
            two      0      6

        The next example aggregates by taking the mean across multiple columns.

        >>> table = pd.pivot_table(df, values=['D', 'E'], index=['A', 'C'],
        ...                     aggfunc={'D': np.mean,
        ...                              'E': np.mean})
        >>> table
                        D         E
        A   C
        bar large  5.500000  7.500000
            small  5.500000  8.500000
        foo large  2.000000  4.500000
            small  2.333333  4.333333

        We can also calculate multiple types of aggregations for any given
        value column.

        >>> table = pd.pivot_table(df, values=['D', 'E'], index=['A', 'C'],
        ...                     aggfunc={'D': np.mean,
        ...                              'E': [min, max, np.mean]})
        >>> table
                        D    E
                    mean  max      mean  min
        A   C
        bar large  5.500000  9.0  7.500000  6.0
            small  5.500000  9.0  8.500000  8.0
        foo large  2.000000  5.0  4.500000  4.0
            small  2.333333  6.0  4.333333  2.0
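
The `margins` and `margins_name` parameters described above add a totals row and column. A minimal sketch on a reduced copy of the example data; the label `"Total"` is an illustrative choice, not the default.

```python
import pandas as pd

df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo",
                         "bar", "bar", "bar", "bar"],
                   "C": ["small", "large", "large", "small",
                         "small", "large", "small", "small",
                         "large"],
                   "D": [1, 2, 2, 3, 3, 4, 5, 6, 7]})

# margins=True appends a row and a column holding the aggregate over
# all groups; margins_name sets their label (default 'All').
table = pd.pivot_table(df, values="D", index="A", columns="C",
                       aggfunc="sum", margins=True, margins_name="Total")
print(table)
```

The bottom-right cell is the grand total of column `D`.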
        �pivot_table�mean�Allc
Cs(ddlm}
|
||||||||||	d�
S)Nr)r�)	r�r�r��aggfuncr��marginsr:�margins_namer�)r�r�)r�r�r�r�r�r�r�r:r�r�r�r�r�r�r��szDataFrame.pivot_tablecCs>ddlm}m}t|ttf�r,||||d�S||||d�SdS)a:
        Stack the prescribed level(s) from columns to index.

        Return a reshaped DataFrame or Series having a multi-level
        index with one or more new inner-most levels compared to the current
        DataFrame. The new inner-most levels are created by pivoting the
        columns of the current dataframe:

          - if the columns have a single level, the output is a Series;
          - if the columns have multiple levels, the new index
            level(s) is (are) taken from the prescribed level(s) and
            the output is a DataFrame.

        Parameters
        ----------
        level : int, str, list, default -1
            Level(s) to stack from the column axis onto the index
            axis, defined as one index or label, or a list of indices
            or labels.
        dropna : bool, default True
            Whether to drop rows in the resulting Frame/Series with
            missing values. Stacking a column level onto the index
            axis can create combinations of index and column values
            that are missing from the original dataframe. See Examples
            section.

        Returns
        -------
        DataFrame or Series
            Stacked dataframe or series.

        See Also
        --------
        DataFrame.unstack : Unstack prescribed level(s) from index axis
             onto column axis.
        DataFrame.pivot : Reshape dataframe from long format to wide
             format.
        DataFrame.pivot_table : Create a spreadsheet-style pivot table
             as a DataFrame.

        Notes
        -----
        The function is named by analogy with a collection of books
        being reorganized from being side by side on a horizontal
        position (the columns of the dataframe) to being stacked
        vertically on top of each other (in the index of the
        dataframe).

        Examples
        --------
        **Single level columns**

        >>> df_single_level_cols = pd.DataFrame([[0, 1], [2, 3]],
        ...                                     index=['cat', 'dog'],
        ...                                     columns=['weight', 'height'])

        Stacking a dataframe with a single level column axis returns a Series:

        >>> df_single_level_cols
             weight height
        cat       0      1
        dog       2      3
        >>> df_single_level_cols.stack()
        cat  weight    0
             height    1
        dog  weight    2
             height    3
        dtype: int64

        **Multi level columns: simple case**

        >>> multicol1 = pd.MultiIndex.from_tuples([('weight', 'kg'),
        ...                                        ('weight', 'pounds')])
        >>> df_multi_level_cols1 = pd.DataFrame([[1, 2], [2, 4]],
        ...                                     index=['cat', 'dog'],
        ...                                     columns=multicol1)

        Stacking a dataframe with a multi-level column axis:

        >>> df_multi_level_cols1
             weight
                 kg    pounds
        cat       1        2
        dog       2        4
        >>> df_multi_level_cols1.stack()
                    weight
        cat kg           1
            pounds       2
        dog kg           2
            pounds       4

        **Missing values**

        >>> multicol2 = pd.MultiIndex.from_tuples([('weight', 'kg'),
        ...                                        ('height', 'm')])
        >>> df_multi_level_cols2 = pd.DataFrame([[1.0, 2.0], [3.0, 4.0]],
        ...                                     index=['cat', 'dog'],
        ...                                     columns=multicol2)

        It is common to have missing values when stacking a dataframe
        with multi-level columns, as the stacked dataframe typically
        has more values than the original dataframe. Missing values
        are filled with NaNs:

        >>> df_multi_level_cols2
            weight height
                kg      m
        cat    1.0    2.0
        dog    3.0    4.0
        >>> df_multi_level_cols2.stack()
                height  weight
        cat kg     NaN     1.0
            m      2.0     NaN
        dog kg     NaN     3.0
            m      4.0     NaN

        **Prescribing the level(s) to be stacked**

        The first parameter controls which level or levels are stacked:

        >>> df_multi_level_cols2.stack(0)
                     kg    m
        cat height  NaN  2.0
            weight  1.0  NaN
        dog height  NaN  4.0
            weight  3.0  NaN
        >>> df_multi_level_cols2.stack([0, 1])
        cat  height  m     2.0
             weight  kg    1.0
        dog  height  m     4.0
             weight  kg    3.0
        dtype: float64

        **Dropping missing values**

        >>> df_multi_level_cols3 = pd.DataFrame([[None, 1.0], [2.0, 3.0]],
        ...                                     index=['cat', 'dog'],
        ...                                     columns=multicol2)

        Note that rows where all values are missing are dropped by
        default but this behaviour can be controlled via the dropna
        keyword parameter:

        >>> df_multi_level_cols3
            weight height
                kg      m
        cat    NaN    1.0
        dog    2.0    3.0
        >>> df_multi_level_cols3.stack(dropna=False)
                height  weight
        cat kg     NaN     NaN
            m      1.0     NaN
        dog kg     NaN     2.0
            m      3.0     NaN
        >>> df_multi_level_cols3.stack(dropna=True)
                height  weight
        cat m      1.0     NaN
        dog kg     NaN     2.0
            m      3.0     NaN
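
`DataFrame.unstack` is listed under See Also as the inverse operation. A small round-trip sketch with the single-level frame from the first example (variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame([[0, 1], [2, 3]],
                  index=["cat", "dog"],
                  columns=["weight", "height"])

# stack() moves the column labels into a new inner-most index level,
# producing a Series keyed by (row label, column label).
stacked = df.stack()

# unstack() moves that inner-most level back onto the columns.
restored = stacked.unstack()

print(stacked)
print(restored)
```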
        r)�stack�stack_multiple)r:N)�pandas.core.reshape.reshaper�r�r�r�r�)r�r�r:r�r�r�r�r�r��s
"zDataFrame.stack)rhr=r�cCs�t|�pt|t�std��|jjs*td��|jdd�}|dk	sBt�||j�}|j	|gdd�j
|�}|rztjt
|��|_n|jj|j�|_|j|jdd	�}|S)
uw
        Transform each element of a list-like to a row, replicating index values.

        .. versionadded:: 0.25.0

        Parameters
        ----------
        column : str or tuple
            Column to explode.
        ignore_index : bool, default False
            If True, the resulting index will be labeled 0, 1, …, n - 1.

            .. versionadded:: 1.1.0

        Returns
        -------
        DataFrame
            Exploded lists to rows of the subset columns;
            index will be duplicated for these rows.

        Raises
        ------
        ValueError :
            If the columns of the frame are not unique.

        See Also
        --------
        DataFrame.unstack : Pivot a level of the (necessarily hierarchical)
            index labels.
        DataFrame.melt : Unpivot a DataFrame from wide format to long format.
        Series.explode : Transform each element of a list-like Series to a row.

        Notes
        -----
        This routine will explode list-likes including lists, tuples,
        Series, and np.ndarray. The result dtype of the subset rows will
        be object. Scalars will be returned unchanged. Empty list-likes will
        result in a np.nan for that row.

        Examples
        --------
        >>> df = pd.DataFrame({'A': [[1, 2, 3], 'foo', [], [3, 4]], 'B': 1})
        >>> df
                   A  B
        0  [1, 2, 3]  1
        1        foo  1
        2         []  1
        3     [3, 4]  1

        >>> df.explode('A')
             A  B
        0    1  1
        0    2  1
        0    3  1
        1  foo  1
        2  NaN  1
        3    3  1
        3    4  1
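
The exploded rows above keep their original index labels. As a sketch of the `ignore_index` parameter documented above, the same frame can be exploded with a fresh 0..n-1 index:

```python
import pandas as pd

df = pd.DataFrame({"A": [[1, 2, 3], "foo", [], [3, 4]], "B": 1})

# ignore_index=True replaces the duplicated labels with 0, 1, ..., n - 1.
exploded = df.explode("A", ignore_index=True)
print(exploded)
```

The empty list still yields one row holding NaN, so the result has seven rows.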
        zcolumn must be a scalarzcolumns must be uniqueT)rbNr�)r�F)r�r�)rPr�r�r�r�rr.r��exploderbrr�r�r�r�r$r)r�rhr=�dfr$r�r�r�r�_s>zDataFrame.explodecCsddlm}||||�S)ab
        Pivot a level of the (necessarily hierarchical) index labels.

        Returns a DataFrame having a new level of column labels whose inner-most level
        consists of the pivoted index labels.

        If the index is not a MultiIndex, the output will be a Series
        (the analogue of stack when the columns are not a MultiIndex).

        Parameters
        ----------
        level : int, str, or list of these, default -1 (last level)
            Level(s) of index to unstack, can pass level name.
        fill_value : int, str or dict
            Replace NaN with this value if the unstack produces missing values.

        Returns
        -------
        Series or DataFrame

        See Also
        --------
        DataFrame.pivot : Pivot a table based on column values.
        DataFrame.stack : Pivot a level of the column labels (inverse operation
            from `unstack`).

        Examples
        --------
        >>> index = pd.MultiIndex.from_tuples([('one', 'a'), ('one', 'b'),
        ...                                    ('two', 'a'), ('two', 'b')])
        >>> s = pd.Series(np.arange(1.0, 5.0), index=index)
        >>> s
        one  a   1.0
             b   2.0
        two  a   3.0
             b   4.0
        dtype: float64

        >>> s.unstack(level=-1)
             a   b
        one  1.0  2.0
        two  3.0  4.0

        >>> s.unstack(level=0)
           one  two
        a  1.0   3.0
        b  2.0   4.0

        >>> df = s.unstack(level=0)
        >>> df.unstack()
        one  a  1.0
             b  2.0
        two  a  3.0
             b  4.0
        dtype: float64
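
The examples above have a value for every combination of index labels. The sketch below, with an illustrative column name `v`, shows what `fill_value` does when a combination is absent:

```python
import pandas as pd

# ("two", "b") is deliberately absent from the index.
index = pd.MultiIndex.from_tuples([("one", "a"), ("one", "b"),
                                   ("two", "a")])
df = pd.DataFrame({"v": [1.0, 2.0, 3.0]}, index=index)

# Without fill_value the missing combination becomes NaN.
wide_nan = df.unstack()

# With fill_value the missing combination takes the given value instead.
wide_filled = df.unstack(fill_value=0.0)

print(wide_nan)
print(wide_filled)
```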
        r)�unstack)r�r�)r�r�r�r�r�r�r�r��s9zDataFrame.unstackryzdf.melt(z
    .. versionadded:: 0.20.0
)Zcaller�versionaddedr!r�c	Cst|||||||d�S)N)�id_vars�
value_vars�var_name�
value_namer!r=)ry)r�r�r�r�r�r!r=r�r�r�ry�szDataFrame.meltZ	Dataframezcaxis : {0 or 'index', 1 or 'columns'}, default 0
    Take difference over rows (0) or columns (1).
a�
        Difference with previous row

        >>> df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
        ...                    'b': [1, 1, 2, 3, 5, 8],
        ...                    'c': [1, 4, 9, 16, 25, 36]})
        >>> df
           a  b   c
        0  1  1   1
        1  2  1   4
        2  3  2   9
        3  4  3  16
        4  5  5  25
        5  6  8  36

        >>> df.diff()
             a    b     c
        0  NaN  NaN   NaN
        1  1.0  0.0   3.0
        2  1.0  1.0   5.0
        3  1.0  1.0   7.0
        4  1.0  2.0   9.0
        5  1.0  3.0  11.0

        Difference with previous column

        >>> df.diff(axis=1)
            a    b     c
        0 NaN  0.0   0.0
        1 NaN -1.0   3.0
        2 NaN -1.0   7.0
        3 NaN -1.0  13.0
        4 NaN  0.0  20.0
        5 NaN  2.0  28.0

        Difference with 3rd previous row

        >>> df.diff(periods=3)
             a    b     c
        0  NaN  NaN   NaN
        1  NaN  NaN   NaN
        2  NaN  NaN   NaN
        3  3.0  2.0  15.0
        4  3.0  4.0  21.0
        5  3.0  6.0  27.0

        Difference with following row

        >>> df.diff(periods=-1)
             a    b     c
        0 -1.0  0.0  -3.0
        1 -1.0 -1.0  -5.0
        2 -1.0 -1.0  -7.0
        3 -1.0 -2.0  -9.0
        4 -1.0 -3.0 -11.0
        5  NaN  NaN   NaN
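
For signed integer and float columns such as these, `diff(periods=n)` gives the same result as subtracting the frame shifted by `n` rows. A short sketch checking that on a reduced copy of the data; it is an illustration, not a statement about how `diff` is implemented.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6],
                   "b": [1, 1, 2, 3, 5, 8]})

by_diff = df.diff(periods=3)

# shift(3) moves every value three rows down, leaving NaN at the top.
by_shift = df - df.shift(3)

# DataFrame.equals treats NaNs in matching positions as equal.
same = by_diff.equals(by_shift)
print(same)
```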

        Overflow in input dtype

        >>> df = pd.DataFrame({'a': [1, 0]}, dtype=np.uint8)
        >>> df.diff()
               a
        0    NaN
        1  255.0)r�Zextra_paramsZother_klassr�)rr�r�cCsN|j|�}|j�|dkr4|dkr4|jj|dd�jS|jj||d�}|j|�S)Nr)r�)rmr�)rWr0r&rIr�r�)r�rr�Zbm_axisrYr�r�r�rIsK
zDataFrame.diff)r�r�r6r�cCs$|dkr|}n|jdkr|S||S)a
        Sub-classes to define. Return a sliced object.

        Parameters
        ----------
        key : string / list of selections
        ndim : 1,2
            requested ndim of result
        subset : object, default None
            subset to act on
        Nr�)r�)r�r�r�r6r�r�r��_gotitembs

zDataFrame._gotitema�
    The aggregation operations are always performed over an axis, either the
    index (default) or the column axis. This behavior is different from
    `numpy` aggregation functions (`mean`, `median`, `prod`, `sum`, `std`,
    `var`), where the default is to compute the aggregation of the flattened
    array, e.g., ``numpy.mean(arr_2d)`` as opposed to
    ``numpy.mean(arr_2d, axis=0)``.

    `agg` is an alias for `aggregate`. Use the alias.

    See Also
    --------
    DataFrame.apply : Perform any type of operations.
    DataFrame.transform : Perform transformation type operations.
    core.groupby.GroupBy : Perform operations over groups.
    core.resample.Resampler : Perform operations over resampled bins.
    core.window.Rolling : Perform operations over rolling window.
    core.window.Expanding : Perform operations over expanding window.
    core.window.ExponentialMovingWindow : Perform operations over an
        exponentially weighted window.
    a�
    Examples
    --------
    >>> df = pd.DataFrame([[1, 2, 3],
    ...                    [4, 5, 6],
    ...                    [7, 8, 9],
    ...                    [np.nan, np.nan, np.nan]],
    ...                   columns=['A', 'B', 'C'])

    Aggregate these functions over the rows.

    >>> df.agg(['sum', 'min'])
            A     B     C
    sum  12.0  15.0  18.0
    min   1.0   2.0   3.0

    Different aggregations per column.

    >>> df.agg({'A' : ['sum', 'min'], 'B' : ['min', 'max']})
            A    B
    max   NaN  8.0
    min   1.0  2.0
    sum  12.0  NaN

    Aggregate over the columns.

    >>> df.agg("mean", axis="columns")
    0    2.0
    1    5.0
    2    8.0
    3    NaN
    dtype: float64
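
The notes for `agg` contrast pandas' per-axis aggregation with NumPy's default of aggregating over the flattened array. A minimal sketch of that difference, using an illustrative 2x2 array:

```python
import numpy as np
import pandas as pd

arr_2d = np.array([[1.0, 2.0], [3.0, 4.0]])
df = pd.DataFrame(arr_2d, columns=["A", "B"])

# NumPy default: one number computed over the flattened array.
flat_mean = np.mean(arr_2d)

# pandas default (axis=0): one value per column.
per_column = df.agg("mean")

# axis="columns": one value per row.
per_row = df.agg("mean", axis="columns")

print(flat_mean, per_column.tolist(), per_row.tolist())
```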
    �	aggregater�z
.. versionadded:: 0.20.0
)r�r�Zsee_alsor�r�c
Os�|j|�}t|f|�\}}}}d}y|j||f|�|�\}}	Wn6tk
rt}
ztd|
���}||
�WYdd}
~
XnX|dkr�|j|f||d�|��S|r�t||||�}t||d�}|S)Nz?DataFrame constructor called with incompatible data and dtype: )r�r�)r�)r2r\�
_aggregater��applyr]r�)
r�r/r�r�r�Z
relabelingr�rgr$r5r�r�Zresult_in_dictr�r�r�r��s	
zDataFrame.aggregatecsN|dkr:|jj|f|�|�\}}|dk	r.|jn|}||fSt�j|f|�|�S)Nr�)r&r�r)r��argr�r�r�r$r5)rr�r�r��s
zDataFrame._aggregate)r�r�cs<|j|�}|dkr(|jj|f|�|�jSt�j|f|�|�S)Nr�)r2r&�	transformr)r�r/r�r�r�)rr�r�r��s
zDataFrame.transformc		Ks*ddlm}||||||||d�}|j�S)a.
        Apply a function along an axis of the DataFrame.

        Objects passed to the function are Series objects whose index is
        either the DataFrame's index (``axis=0``) or the DataFrame's columns
        (``axis=1``). By default (``result_type=None``), the final return type
        is inferred from the return type of the applied function. Otherwise,
        it depends on the `result_type` argument.

        Parameters
        ----------
        func : function
            Function to apply to each column or row.
        axis : {0 or 'index', 1 or 'columns'}, default 0
            Axis along which the function is applied:

            * 0 or 'index': apply function to each column.
            * 1 or 'columns': apply function to each row.

        raw : bool, default False
            Determines if row or column is passed as a Series or ndarray object:

            * ``False`` : passes each row or column as a Series to the
              function.
            * ``True`` : the passed function will receive ndarray objects
              instead.
              If you are just applying a NumPy reduction function this will
              achieve much better performance.

        result_type : {'expand', 'reduce', 'broadcast', None}, default None
            These only act when ``axis=1`` (columns):

            * 'expand' : list-like results will be turned into columns.
            * 'reduce' : returns a Series if possible rather than expanding
              list-like results. This is the opposite of 'expand'.
            * 'broadcast' : results will be broadcast to the original shape
              of the DataFrame, the original index and columns will be
              retained.

            The default behaviour (None) depends on the return value of the
            applied function: list-like results will be returned as a Series
            of those. However if the apply function returns a Series these
            are expanded to columns.

            .. versionadded:: 0.23.0

        args : tuple
            Positional arguments to pass to `func` in addition to the
            array/series.
        **kwds
            Additional keyword arguments to pass to `func`.

        Returns
        -------
        Series or DataFrame
            Result of applying ``func`` along the given axis of the
            DataFrame.

        See Also
        --------
        DataFrame.applymap: For elementwise operations.
        DataFrame.aggregate: Only perform aggregating type operations.
        DataFrame.transform: Only perform transforming type operations.

        Examples
        --------
        >>> df = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])
        >>> df
           A  B
        0  4  9
        1  4  9
        2  4  9

        Using a numpy universal function (in this case the same as
        ``np.sqrt(df)``):

        >>> df.apply(np.sqrt)
             A    B
        0  2.0  3.0
        1  2.0  3.0
        2  2.0  3.0

        Using a reducing function on either axis

        >>> df.apply(np.sum, axis=0)
        A    12
        B    27
        dtype: int64

        >>> df.apply(np.sum, axis=1)
        0    13
        1    13
        2    13
        dtype: int64

        Returning a list-like will result in a Series

        >>> df.apply(lambda x: [1, 2], axis=1)
        0    [1, 2]
        1    [1, 2]
        2    [1, 2]
        dtype: object

        Passing ``result_type='expand'`` will expand list-like results
        to columns of a DataFrame

        >>> df.apply(lambda x: [1, 2], axis=1, result_type='expand')
           0  1
        0  1  2
        1  1  2
        2  1  2

        Returning a Series inside the function is similar to passing
        ``result_type='expand'``. The resulting column names
        will be the Series index.

        >>> df.apply(lambda x: pd.Series([1, 2], index=['foo', 'bar']), axis=1)
           foo  bar
        0    1    2
        1    1    2
        2    1    2

        Passing ``result_type='broadcast'`` will ensure the same shape
        result, whether list-like or scalar is returned by the function,
        and broadcast it along the axis. The resulting column names will
        be the originals.

        >>> df.apply(lambda x: [1, 2], axis=1, result_type='broadcast')
           A  B
        0  1  2
        1  1  2
        2  1  2
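
The `args` and `**kwds` parameters are not exercised by the examples above. The sketch below forwards one positional and one keyword argument to a hypothetical helper, `weighted_sum`, which is defined here only for illustration.

```python
import pandas as pd

df = pd.DataFrame([[4, 9]] * 3, columns=["A", "B"])


def weighted_sum(row, weight, offset=0):
    """Hypothetical helper: combine the two columns of one row."""
    return row["A"] * weight + row["B"] + offset


# axis=1 passes each row as a Series; args supplies `weight`
# positionally and the extra keyword supplies `offset`.
result = df.apply(weighted_sum, axis=1, args=(2,), offset=1)
print(result)
```

Each row evaluates to 4 * 2 + 9 + 1.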
        r)�frame_apply)r/r��raw�result_typer��kwds)�pandas.core.applyr��
get_result)	r�r/r�r�r�r�r�r��opr�r�r�r��szDataFrame.applycs�fdd�}|j|�S)a�
        Apply a function to a DataFrame elementwise.

        This method applies a function that accepts and returns a scalar
        to every element of a DataFrame.

        Parameters
        ----------
        func : callable
            Python function, returns a single value from a single value.

        Returns
        -------
        DataFrame
            Transformed DataFrame.

        See Also
        --------
        DataFrame.apply : Apply a function along input axis of DataFrame.

        Examples
        --------
        >>> df = pd.DataFrame([[1, 2.12], [3.356, 4.567]])
        >>> df
               0      1
        0  1.000  2.120
        1  3.356  4.567

        >>> df.applymap(lambda x: len(str(x)))
           0  1
        0  3  4
        1  5  5

        Note that a vectorized version of `func` often exists, which will
        be much faster. You could square each number elementwise.

        >>> df.applymap(lambda x: x**2)
                   0          1
        0   1.000000   4.494400
        1  11.262736  20.857489

        But it's better to avoid applymap in that case.

        >>> df ** 2
                   0          1
        0   1.000000   4.494400
        1  11.262736  20.857489
        cs&|jrtj|��Stj|jt�j��S)N)r�rZ	map_inferr��objectr)rV)r/r�r�rv�sz!DataFrame.applymap.<locals>.infer)r�)r�r/rvr�)r/r��applymap�s2zDataFrame.applymapc
Cslt|ttf�r�t|t�r,|s$td��t|�}|jdkrD|rDtd��t|jg|jjd�}|jj|j�}y|jj	|�}Wn&tk
r�|jj
t�j	|�}YnX|j|dd�j
�jj�j|jdd�}|jj|�s�|j|d�}nPt|t��r,|s�n>t|dt��s,t|�}|jj|j�dkj��r,|j|jd�}dd	lm}t|ttf��rT|f|�}	n||g}	||	|||d
�S)u


        Append rows of `other` to the end of caller, returning a new object.

        Columns in `other` that are not in the caller are added as new columns.

        Parameters
        ----------
        other : DataFrame or Series/dict-like object, or list of these
            The data to append.
        ignore_index : bool, default False
            If True, the resulting axis will be labeled 0, 1, …, n - 1.
        verify_integrity : bool, default False
            If True, raise ValueError on creating index with duplicates.
        sort : bool, default False
            Sort columns if the columns of `self` and `other` are not aligned.

            .. versionadded:: 0.23.0
            .. versionchanged:: 1.0.0

                Changed to not sort by default.

        Returns
        -------
        DataFrame

        See Also
        --------
        concat : General function to concatenate DataFrame or Series objects.

        Notes
        -----
        If a list of dict/series is passed and the keys are all contained in
        the DataFrame's index, the order of the columns in the resulting
        DataFrame will be unchanged.

        Iteratively appending rows to a DataFrame can be more computationally
        intensive than a single concatenate. A better solution is to append
        those rows to a list and then concatenate the list with the original
        DataFrame all at once.

        Examples
        --------
        >>> df = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
        >>> df
           A  B
        0  1  2
        1  3  4
        >>> df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'))
        >>> df.append(df2)
           A  B
        0  1  2
        1  3  4
        0  5  6
        1  7  8

        With `ignore_index` set to True:

        >>> df.append(df2, ignore_index=True)
           A  B
        0  1  2
        1  3  4
        2  5  6
        3  7  8

        The following, while not recommended methods for generating DataFrames,
        show two ways to generate a DataFrame from multiple data sources.

        Less efficient:

        >>> df = pd.DataFrame(columns=['A'])
        >>> for i in range(5):
        ...     df = df.append({'A': i}, ignore_index=True)
        >>> df
           A
        0  0
        1  1
        2  2
        3  3
        4  4

        More efficient:

        >>> pd.concat([pd.DataFrame([i], columns=['A']) for i in range(5)],
        ...           ignore_index=True)
           A
        0  0
        1  1
        2  2
        3  3
        4  4
        z+Can only append a dict if ignore_index=TrueNzICan only append a Series if ignore_index=True or if the Series has a name)r�F)r�)r�r)�concat)r=rqrE)r�rzr�r�r�rfr�rHr�rr�r�rZto_framer&Z
infer_objectsZrename_axisr�r�r�r�r�r#�pandas.core.reshape.concatr�r�)
r�r!r=rqrEr��idx_diffZcombined_columnsr�Z	to_concatr�r�r�r�sD^
zDataFrame.appendcCs|j||||||d�S)a�
        Join columns of another DataFrame.

        Join columns with `other` DataFrame either on index or on a key
        column. Efficiently join multiple DataFrame objects by index at once by
        passing a list.

        Parameters
        ----------
        other : DataFrame, Series, or list of DataFrame
            Index should be similar to one of the columns in this one. If a
            Series is passed, its name attribute must be set, and that will be
            used as the column name in the resulting joined DataFrame.
        on : str, list of str, or array-like, optional
            Column or index level name(s) in the caller to join on the index
            in `other`, otherwise joins index-on-index. If multiple
            values given, the `other` DataFrame must have a MultiIndex. Can
            pass an array as the join key if it is not already contained in
            the calling DataFrame. Like an Excel VLOOKUP operation.
        how : {'left', 'right', 'outer', 'inner'}, default 'left'
            How to handle the operation of the two objects.

            * left: use calling frame's index (or column if on is specified)
            * right: use `other`'s index.
            * outer: form union of calling frame's index (or column if on is
              specified) with `other`'s index, and sort it
              lexicographically.
            * inner: form intersection of calling frame's index (or column if
              on is specified) with `other`'s index, preserving the order
              of the calling frame's index.
        lsuffix : str, default ''
            Suffix to use from left frame's overlapping columns.
        rsuffix : str, default ''
            Suffix to use from right frame's overlapping columns.
        sort : bool, default False
            Order result DataFrame lexicographically by the join key. If False,
            the order of the join key depends on the join type (how keyword).

        Returns
        -------
        DataFrame
            A dataframe containing columns from both the caller and `other`.

        See Also
        --------
        DataFrame.merge : For column(s)-on-column(s) operations.

        Notes
        -----
        Parameters `on`, `lsuffix`, and `rsuffix` are not supported when
        passing a list of `DataFrame` objects.

        Support for specifying index levels as the `on` parameter was added
        in version 0.23.0.

        Examples
        --------
        >>> df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
        ...                    'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})

        >>> df
          key   A
        0  K0  A0
        1  K1  A1
        2  K2  A2
        3  K3  A3
        4  K4  A4
        5  K5  A5

        >>> other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],
        ...                       'B': ['B0', 'B1', 'B2']})

        >>> other
          key   B
        0  K0  B0
        1  K1  B1
        2  K2  B2

        Join DataFrames using their indexes.

        >>> df.join(other, lsuffix='_caller', rsuffix='_other')
          key_caller   A key_other    B
        0         K0  A0        K0   B0
        1         K1  A1        K1   B1
        2         K2  A2        K2   B2
        3         K3  A3       NaN  NaN
        4         K4  A4       NaN  NaN
        5         K5  A5       NaN  NaN

        If we want to join using the key columns, we need to set key to be
        the index in both `df` and `other`. The joined DataFrame will have
        key as its index.

        >>> df.set_index('key').join(other.set_index('key'))
              A    B
        key
        K0   A0   B0
        K1   A1   B1
        K2   A2   B2
        K3   A3  NaN
        K4   A4  NaN
        K5   A5  NaN

        Another option to join using the key columns is to use the `on`
        parameter. DataFrame.join always uses `other`'s index but we can use
        any column in `df`. This method preserves the original DataFrame's
        index in the result.

        >>> df.join(other.set_index('key'), on='key')
          key   A    B
        0  K0  A0   B0
        1  K1  A1   B1
        2  K2  A2   B2
        3  K3  A3  NaN
        4  K4  A4  NaN
        5  K5  A5  NaN
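
The `how` options described above differ mainly in which index labels end up in the result. The following is a minimal pure-Python sketch of that label selection, not the pandas implementation: `join_labels` is a hypothetical helper, and it assumes plain lists of unique labels.

```python
def join_labels(left, right, how="left"):
    """Return the result labels for joining two lists of unique labels.

    Illustrative only: mirrors the documented meaning of ``how``.
    """
    left_set = set(left)
    right_set = set(right)
    if how == "left":
        # Keep the calling frame's labels, in their original order.
        return list(left)
    if how == "right":
        # Keep the other frame's labels, in their original order.
        return list(right)
    if how == "inner":
        # Intersection, preserving the calling frame's order.
        return [label for label in left if label in right_set]
    if how == "outer":
        # Union of both label sets, sorted lexicographically.
        return sorted(left_set | right_set)
    raise ValueError(f"unknown how: {how!r}")


left = ["K2", "K0", "K5"]
right = ["K0", "K1", "K2"]
inner = join_labels(left, right, how="inner")
outer = join_labels(left, right, how="outer")
```

Labels that appear on only one side are the ones whose missing columns are filled with ``NaN`` in the joined frame.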
        )�onr5�lsuffix�rsuffixrE)�_join_compat)r�r!r�r5r�r�rEr�r�r�rJsxzDataFrame.joinc
Csddlm}ddlm}t|t�rB|jdkr4td��t|j|i�}t|t�rl||||||dkd||f|d�S|dk	r|td��|gt	|�}	t
dd	�|	D��}
|
r�|d
kr�||	ddd|d
�}|j|jdd�S||	d|d|d
�S|	d}x&|	dd�D]}
|||
|ddd�}q�W|SdS)Nr)r�)�mergezOther Series must have a nameT)�left_onr5�
left_index�right_index�suffixesrEz?Joining multiple DataFrames only supported for joining on indexcss|]}|jjVqdS)N)r�r)r�r�r�r�r�r��sz)DataFrame._join_compat.<locals>.<genexpr>r"r�r)r�rrqrEF)r�)r5r�r�)
r�r��pandas.core.reshape.merger�r�rzr�r�r�r�r#rr�)r�r!r�r5r�r�rEr�r��framesZ
can_concatr�Zjoinedr�r�r�r�r��sB


zDataFrame._join_compat)�indents�inner�_x�_yc
Cs.ddlm}
|
||||||||||	|
||d�
S)Nr)r�)r5r�r��right_onr�r�rEr�r��	indicator�validate)r�r�)r�r#r5r�r�r�r�r�rEr�r�r�r�r�r�r�r�r��szDataFrame.mergecs�ddlm}�fdd�}dd��tj||�t�ttf�rdt�t�rT�jjsTt	d��t
||���}n*t��r���fdd	�|j�D�}nt
d
��t|�dkr�|j||dd�|j|jd
�S|SdS)a�	
        Round a DataFrame to a variable number of decimal places.

        Parameters
        ----------
        decimals : int, dict, Series
            Number of decimal places to round each column to. If an int is
            given, round each column to the same number of places.
            Otherwise dict and Series round to variable numbers of places.
            Column names should be in the keys if `decimals` is a
            dict-like, or in the index if `decimals` is a Series. Any
            columns not included in `decimals` will be left as is. Elements
            of `decimals` which are not columns of the input will be
            ignored.
        *args
            Additional positional arguments have no effect but might be accepted for
            compatibility with numpy.
        **kwargs
            Additional keywords have no effect but might be accepted for
            compatibility with numpy.

        Returns
        -------
        DataFrame
            A DataFrame with the affected columns rounded to the specified
            number of decimal places.

        See Also
        --------
        numpy.around : Round a numpy array to the given number of decimals.
        Series.round : Round a Series to the given number of decimals.

        Examples
        --------
        >>> df = pd.DataFrame([(.21, .32), (.01, .67), (.66, .03), (.21, .18)],
        ...                   columns=['dogs', 'cats'])
        >>> df
            dogs  cats
        0  0.21  0.32
        1  0.01  0.67
        2  0.66  0.03
        3  0.21  0.18

        By providing an integer, each column is rounded to the same number
        of decimal places:

        >>> df.round(1)
            dogs  cats
        0   0.2   0.3
        1   0.0   0.7
        2   0.7   0.0
        3   0.2   0.2

        With a dict, the number of places for specific columns can be
        specified with the column names as keys and the number of decimal
        places as values:

        >>> df.round({'dogs': 1, 'cats': 0})
            dogs  cats
        0   0.2   0.0
        1   0.0   1.0
        2   0.7   0.0
        3   0.2   0.0

        Using a Series, the number of places for specific columns can be
        specified with the column names as the index and the number of
        decimal places as the values:

        >>> decimals = pd.Series([0, 1], index=['cats', 'dogs'])
        >>> df.round(decimals)
            dogs  cats
        0   0.2   0.0
        1   0.0   1.0
        2   0.7   0.0
        3   0.2   0.0
        r)r�c3sJxD|j�D]8\}}y�|||�VWq
tk
r@|VYq
Xq
WdS)N)rr`)r��decimalsrWrC)�
_series_roundr�r��_dict_roundms
z$DataFrame.round.<locals>._dict_roundcSst|�st|�r|j|�S|S)N)rKrH�round)rr�r�r�r�r�ts
z&DataFrame.round.<locals>._series_roundz Index of decimals must be uniquecsg|]\}}�|���qSr�r�)r�r�r)r�r�r�r�r��sz#DataFrame.round.<locals>.<listcomp>z4decimals must be an integer, a dict-like or a Seriesr�)r�)r�r�N)r�r�r�Zvalidate_roundr�r�rzr�rr�r�rJrr�r�r�r�)r�r�r�r�r�r�Znew_colsr�)r�r�r�r�s M
zDataFrame.round�pearsoncCs||j�}|j}|j�}|jttjdd�}|dkrBtj||d�}�n*|dkr\tj	||d�}�n|dksnt
|��r\|dkrzd}|j}tj
|�}t|�}	tj|	|	ftd	�}tj|�}
x�t|�D]�\}}x�t|�D]�\}
}||
kr�q�|
||
|
@}|j�|k�rtj}n8||
k�rd
}n(|j��s0|||||�}n
|||�}||||
f<|||
|f<q�Wq�Wntd|�d���|j|||d
�S)aZ
        Compute pairwise correlation of columns, excluding NA/null values.

        Parameters
        ----------
        method : {'pearson', 'kendall', 'spearman'} or callable
            Method of correlation:

            * pearson : standard correlation coefficient
            * kendall : Kendall Tau correlation coefficient
            * spearman : Spearman rank correlation
            * callable: callable with input two 1d ndarrays
                and returning a float. Note that the returned matrix from corr
                will have 1 along the diagonals and will be symmetric
                regardless of the callable's behavior.

                .. versionadded:: 0.24.0

        min_periods : int, optional
            Minimum number of observations required per pair of columns
            to have a valid result. Currently only available for Pearson
            and Spearman correlation.

        Returns
        -------
        DataFrame
            Correlation matrix.

        See Also
        --------
        DataFrame.corrwith : Compute pairwise correlation with another
            DataFrame or Series.
        Series.corr : Compute the correlation between two Series.

        Examples
        --------
        >>> def histogram_intersection(a, b):
        ...     v = np.minimum(a, b).sum().round(decimals=1)
        ...     return v
        >>> df = pd.DataFrame([(.2, .3), (.0, .6), (.6, .0), (.2, .1)],
        ...                   columns=['dogs', 'cats'])
        >>> df.corr(method=histogram_intersection)
              dogs  cats
        dogs   1.0   0.3
        cats   0.3   1.0
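
The note that a callable always yields 1 on the diagonal and a symmetric matrix can be pictured with a small pure-Python sketch. This is an illustration under stated assumptions, not the pandas code: `pairwise_corr` is a hypothetical helper, columns are plain lists of floats, and NaN marks a missing value.

```python
import math


def pairwise_corr(columns, func, min_periods=1):
    """Build a symmetric matrix by calling ``func`` once per column pair."""
    names = list(columns)
    result = {a: {b: math.nan for b in names} for a in names}
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            if i > j:
                continue  # the lower triangle is filled by symmetry
            # Keep only observations that are valid in both columns.
            pairs = [(x, y) for x, y in zip(columns[a], columns[b])
                     if math.isfinite(x) and math.isfinite(y)]
            if len(pairs) < max(min_periods, 1):
                value = math.nan
            elif i == j:
                value = 1.0  # the diagonal never calls ``func``
            else:
                xs, ys = zip(*pairs)
                value = func(xs, ys)
            result[a][b] = value
            result[b][a] = value
    return result


def histogram_intersection(a, b):
    return round(sum(min(x, y) for x, y in zip(a, b)), 1)


data = {"dogs": [0.2, 0.0, 0.6, 0.2], "cats": [0.3, 0.6, 0.0, 0.1]}
matrix = pairwise_corr(data, histogram_intersection)
```

Because each unordered pair is evaluated once and mirrored, a callable that is not symmetric in its arguments still produces a symmetric result.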
        F)r�r/r�r�)�minp�spearman�kendallNr�)r�g�?zHmethod must be either 'pearson', 'spearman', 'kendall', or a callable, 'z' was supplied)r�r�)�_get_numeric_datar�r�r2�floatr�r��libalgos�nancorrZnancorr_spearman�callabler&rYZ
get_corr_funcr�r��isfiniterr^r#r�r�)r�r��min_periods�
numeric_dfr�r��mat�correlZcorrf�Kr�r�acreZbcZvalidrgr�r�r��corr�sB/




zDataFrame.corr)r��ddofr�cCs�|j�}|j}|j�}|jttjdd�}t|�j�r�|dk	rp|t	|�krptj
|jd|jdf�}|jtj�ntj
|j|d�}|jt	|�t	|�f�}ntj|d|d�}|j|||d�S)	a�
        Compute pairwise covariance of columns, excluding NA/null values.

        Compute the pairwise covariance among the series of a DataFrame.
        The returned data frame is the `covariance matrix
        <https://en.wikipedia.org/wiki/Covariance_matrix>`__ of the columns
        of the DataFrame.

        Both NA and null values are automatically excluded from the
        calculation. (See the note below about bias from missing values.)
        A threshold can be set for the minimum number of
        observations for each value created. Comparisons with observations
        below this threshold will be returned as ``NaN``.

        This method is generally used for the analysis of time series data to
        understand the relationship between different measures
        across time.

        Parameters
        ----------
        min_periods : int, optional
            Minimum number of observations required per pair of columns
            to have a valid result.

        ddof : int, default 1
            Delta degrees of freedom.  The divisor used in calculations
            is ``N - ddof``, where ``N`` represents the number of elements.

            .. versionadded:: 1.1.0

        Returns
        -------
        DataFrame
            The covariance matrix of the series of the DataFrame.

        See Also
        --------
        Series.cov : Compute covariance with another Series.
        core.window.ExponentialMovingWindow.cov: Exponential weighted sample covariance.
        core.window.Expanding.cov : Expanding sample covariance.
        core.window.Rolling.cov : Rolling sample covariance.

        Notes
        -----
        Returns the covariance matrix of the DataFrame's time series.
        The covariance is normalized by N-ddof.

        For DataFrames that have Series that are missing data (assuming that
        data is `missing at random
        <https://en.wikipedia.org/wiki/Missing_data#Missing_at_random>`__)
        the returned covariance matrix will be an unbiased estimate
        of the variance and covariance between the member Series.

        However, for many applications this estimate may not be acceptable
        because the estimated covariance matrix is not guaranteed to be positive
        semi-definite. This could lead to estimated correlations having
        absolute values which are greater than one, and/or a non-invertible
        covariance matrix. See `Estimation of covariance matrices
        <https://en.wikipedia.org/w/index.php?title=Estimation_of_covariance_
        matrices>`__ for more details.

        Examples
        --------
        >>> df = pd.DataFrame([(1, 2), (0, 3), (2, 0), (1, 1)],
        ...                   columns=['dogs', 'cats'])
        >>> df.cov()
                  dogs      cats
        dogs  0.666667 -1.000000
        cats -1.000000  1.666667

        >>> np.random.seed(42)
        >>> df = pd.DataFrame(np.random.randn(1000, 5),
        ...                   columns=['a', 'b', 'c', 'd', 'e'])
        >>> df.cov()
                  a         b         c         d         e
        a  0.998438 -0.020161  0.059277 -0.008943  0.014144
        b -0.020161  1.059352 -0.008543 -0.024738  0.009826
        c  0.059277 -0.008543  1.010670 -0.001486 -0.000271
        d -0.008943 -0.024738 -0.001486  0.921297 -0.013692
        e  0.014144  0.009826 -0.000271 -0.013692  0.977795

        **Minimum number of periods**

        This method also supports an optional ``min_periods`` keyword
        that specifies the required minimum number of non-NA observations for
        each column pair in order to have a valid result:

        >>> np.random.seed(42)
        >>> df = pd.DataFrame(np.random.randn(20, 3),
        ...                   columns=['a', 'b', 'c'])
        >>> df.loc[df.index[:5], 'a'] = np.nan
        >>> df.loc[df.index[5:10], 'b'] = np.nan
        >>> df.cov(min_periods=12)
                  a         b         c
        a  0.316741       NaN -0.150812
        b       NaN  1.248003  0.191417
        c -0.150812  0.191417  0.895202
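
As a rough illustration of how `ddof` and `min_periods` interact for a single pair of columns, here is a pure-Python sketch. It is not the pandas implementation: `pairwise_cov` is a hypothetical helper, and it assumes that only rows valid in both columns are used.

```python
import math


def pairwise_cov(xs, ys, ddof=1, min_periods=None):
    """Covariance of two equal-length lists, skipping rows with NaN."""
    pairs = [(x, y) for x, y in zip(xs, ys)
             if not (math.isnan(x) or math.isnan(y))]
    n = len(pairs)
    if min_periods is None:
        min_periods = 1
    # Too few shared observations, or a non-positive divisor: no result.
    if n < min_periods or n - ddof <= 0:
        return math.nan
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return total / (n - ddof)  # the divisor is N - ddof


dogs = [1.0, 0.0, 2.0, 1.0]
cats = [2.0, 3.0, 0.0, 1.0]
cov_dogs_cats = pairwise_cov(dogs, cats)
```

With the `dogs` and `cats` data from the first example, this gives the same values as the matrix shown there.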
        F)r�r/r�Nr�)r�T)�covr�)r�r�)r�r�r�r2r�r�r�rVr#r�r�r��fillr�r&�reshaper�r�r�)r�r�r�r�r�r�r�Zbase_covr�r�r�r��sez
DataFrame.covcs�|j|�}|j�}t�t�r4|j��fdd�|d�S�j��|j�ddd�\}}|dkrd|j}|j}�dkr�||d	}||d	}||j�}||j�}	||	j�}
|j	�d|j
�|j
�}|
|}nT�dks�t���r�fdd
�}
|jt
|
t|jj|jj��|jd�}ntd��d���|�s�|d	k�r6dnd	}|j|�j�j|��}|j|j�}t|�d	k�r�|jttjgt|�|d��}|S)a�
        Compute pairwise correlation.

        Pairwise correlation is computed between rows or columns of
        DataFrame with rows or columns of Series or DataFrame. DataFrames
        are first aligned along both axes before computing the
        correlations.

        Parameters
        ----------
        other : DataFrame, Series
            Object with which to compute correlations.
        axis : {0 or 'index', 1 or 'columns'}, default 0
            The axis to use. 0 or 'index' to compute column-wise, 1 or 'columns' for
            row-wise.
        drop : bool, default False
            Drop missing indices from result.
        method : {'pearson', 'kendall', 'spearman'} or callable
            Method of correlation:

            * pearson : standard correlation coefficient
            * kendall : Kendall Tau correlation coefficient
            * spearman : Spearman rank correlation
            * callable: callable with input two 1d ndarrays
                and returning a float.

            .. versionadded:: 0.24.0

        Returns
        -------
        Series
            Pairwise correlations.

        See Also
        --------
        DataFrame.corr : Compute pairwise correlation of columns.
        cs�j|�d�S)N)r�)r�)rV)r�r!r�r��<lambda>� sz$DataFrame.corrwith.<locals>.<lambda>)r�r�F)rr�r�r�rr�r�cstj|d|d�d�S)Nrr�)r�)rYr�)rV)r�r�r�rg� szDataFrame.corrwith.<locals>.c)r�zInvalid method zM was passed, valid methods are: 'pearson', 'kendall', 'spearman', or callable)r�r�)r2r�r�rzr�rr&r�r^rlZstdr�r	rr
r�r�r�r3rrHr�r�rr�r�)r�r!r�rbr�rqr"r#ZldemZrdem�num�domr�rgZraxisrdr�r�)r�r!r��corrwith_ s<&


"zDataFrame.corrwithcCs�|j|�}|dk	r"|j|||d�S|r0|j�}n|}t|j|��dkr\|jd|j|�d�}nJ|jsj|jj	r|t
|�j|d�}n*t
|�j|d�}|j}|j||j|�d�}|j
d�S)a	
        Count non-NA cells for each column or row.

        The values `None`, `NaN`, `NaT`, and optionally `numpy.inf` (depending
        on `pandas.options.mode.use_inf_as_na`) are considered NA.

        Parameters
        ----------
        axis : {0 or 'index', 1 or 'columns'}, default 0
            If 0 or 'index' counts are generated for each column.
            If 1 or 'columns' counts are generated for each row.
        level : int or str, optional
            If the axis is a `MultiIndex` (hierarchical), count along a
            particular `level`, collapsing into a `DataFrame`.
            A `str` specifies the level name.
        numeric_only : bool, default False
            Include only `float`, `int` or `boolean` data.

        Returns
        -------
        Series or DataFrame
            For each column/row the number of non-NA/null entries.
            If `level` is specified returns a `DataFrame`.

        See Also
        --------
        Series.count: Number of non-NA elements in a Series.
        DataFrame.shape: Number of DataFrame rows and columns (including NA
            elements).
        DataFrame.isna: Boolean same-sized DataFrame showing places of NA
            elements.

        Examples
        --------
        Constructing DataFrame from a dictionary:

        >>> df = pd.DataFrame({"Person":
        ...                    ["John", "Myla", "Lewis", "John", "Myla"],
        ...                    "Age": [24., np.nan, 21., 33, 26],
        ...                    "Single": [False, True, True, True, False]})
        >>> df
           Person   Age  Single
        0    John  24.0   False
        1    Myla   NaN    True
        2   Lewis  21.0    True
        3    John  33.0    True
        4    Myla  26.0   False

        Notice the uncounted NA values:

        >>> df.count()
        Person    5
        Age       4
        Single    5
        dtype: int64

        Counts for each **row**:

        >>> df.count(axis='columns')
        0    3
        1    2
        2    3
        3    3
        4    3
        dtype: int64

        Counts for one level of a `MultiIndex`:

        >>> df.set_index(["Person", "Single"]).count(level="Person")
                Age
        Person
        John      2
        Lewis     1
        Myla      1
        N)r��numeric_onlyr)r�)r�Zint64)r2�_count_levelr�r�r3r	�
_get_agg_axisr�r�r�rVr^r�r�)r�r�r�r�r�r$Z
series_countsr_r�r�r�rl� sL

zDataFrame.countcCs|r|j�}n|}|j|�}|j|�}t|t�sFtd|j|��d���|jrXt|�j	}n
t|j	�}t|j
|d��}|dkr�||@}	n|jdd�|@}	t|t�r�|j
|�}|j|}
|j|j|
d�}t|j|�}tj|	|t|�|d�}
|dk�r|j|
||d�}n|j|
||d�}|S)	Nz&Can only count levels on hierarchical r�)r�r�)r�)r�)r�r�r�)r�r3r�r�rjr��_get_axis_namer�rVr�Zget_level_valuesr�r�r'Z_namesr)Z
_shallow_copyr?r*rZcount_level_2dr�r�)r�r�r�r�r�Z
count_axisr7Zvalues_maskZ
index_maskr�Z
level_nameZlevel_indexZlevel_codesr_r$r�r�r�r�&!s2









zDataFrame._count_level)r�cs��dks�dkst���tjdd��j�D�td�}|dkrp�dkrp|j�rptjdtdd	��j	|}	�|	��dkr��dkr�d}
d}n�j
����j��}
�j}�����fd
d�}���fdd
�}
|dk	oֈdk�rȈ}|dkr�|
dd�}�dk�r|j
}d��dk�rdnd}����fdd�}|jj|�}t|t��sDt�t|��rxt|�tt|j���dk�sxt|j���|j|tt|��|d�}|j	|_�dk�r�t|j��r�t|j|j�|dd�<|S�j�s܈jj�rD�dk�rD�dk�rDt��dk�rDddl m!}|�|ddd�}|j"�}|j#�j#k�r@|j$dj%d�}|S|dk�r��}|j}y||�}WnPt&k
�r�|
dd�}|j��}
|j}tj'dd��||�}WdQRXYnXn4|�r�|
dd�}|j��}
|j}n
�}|j}||�}�dk�rt|��r�dk�rtj(|�}n�t)|d��r�t|j��r�y@�dk�rH|j*tj+�}n$�dk�rlt,|�j-��rl|j*tj(�}Wn0t.t&fk
�r��dk�r�t||j�}YnX|dk	�r��j||
d�}|S)Nr�cSsg|]}t|j��qSr�)rDr�)r�r�r�r�r�r�b!sz%DataFrame._reduce.<locals>.<listcomp>)r�r��medianz�DataFrame.mean and DataFrame.median with numeric_only=None will include datetime64 and datetime64tz columns in a future version.r�)r3cs>t|j�r$t|�j�fd�i���S�|f��d����SdS)N�skipna)r�r�)rGr�rb�_reduce)r�)r�r�r�r�r�r�r�r/{!s
zDataFrame._reduce.<locals>.funccsV�dkr�j�}n@�dkr>|r4�dkr.�j�n�}qR�j�}nd��d�}t|��|S)Nr�rz.Generating numeric_only data with filter_type z not supported.)r�Z_get_bool_datar�)�axis_mattersr�ro)r��filter_typer�r�r��	_get_data�!s

z$DataFrame._reduce.<locals>._get_datarr�T)r�cs:t|t�r |j�fd�i���S�|fd�d����SdS)Nr�r�)r�r�)r�r_r�)r�)r�r�r�r�r�r��blk_func�!s
z#DataFrame._reduce.<locals>.blk_func)r�r�)r��expand)r/r�Zignore_failuresFr)r#r�)r�)r�r�)rr�)/r�r�r�r�r�r�r@rArDr�r2r�r�r&r��reducer�r�r�r�r�r+r	r�r�rOr�r3r�r�r�r�r�r�r�r�r�rr�r}Zbool_rr��float64rVr#r�)r�r�r�r�r�r�r�r�Zdtype_is_dtr�r
r�r/r�r�Z	out_dtyper�r�rjr�Zopar$r�r�r�)r�r�r�r�r�r�r�r�r�S!s�





*"







zDataFrame._reducecCs|jtj||d�S)a�
        Count distinct observations over requested axis.

        Return Series with number of distinct observations. Can ignore NaN
        values.

        Parameters
        ----------
        axis : {0 or 'index', 1 or 'columns'}, default 0
            The axis to use. 0 or 'index' counts distinct values in each
            column, 1 or 'columns' counts them in each row.
        dropna : bool, default True
            Don't include NaN in the counts.

        Returns
        -------
        Series

        See Also
        --------
        Series.nunique: Method nunique for Series.
        DataFrame.count: Count non-NA cells for each column or row.

        Examples
        --------
        >>> df = pd.DataFrame({'A': [1, 2, 3], 'B': [1, 1, 1]})
        >>> df.nunique()
        A    3
        B    1
        dtype: int64

        >>> df.nunique(axis=1)
        0    1
        1    2
        2    2
        dtype: int64
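
The two axis choices can be sketched in plain Python. This is an illustration only: `nunique` here is a hypothetical stand-alone function over a list, and it treats only float NaN as missing.

```python
import math


def nunique(values, dropna=True):
    """Count distinct values in a list, optionally counting NaN once."""
    distinct = set()
    seen_nan = False
    for value in values:
        if isinstance(value, float) and math.isnan(value):
            seen_nan = True
        else:
            distinct.add(value)
    extra = 1 if (seen_nan and not dropna) else 0
    return len(distinct) + extra


columns = {"A": [1, 2, 3], "B": [1, 1, 1]}

# axis=0: one count per column.
per_column = {name: nunique(col) for name, col in columns.items()}

# axis=1: one count per row.
per_row = [nunique(row) for row in zip(*columns.values())]
```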
        )r�r:)r�rz�nunique)r�r�r:r�r�r�r��!s&zDataFrame.nuniquecs\|j|�}tj|j||d�}t|tj�s,t�|j|���fdd�|D�}|j	||j
|�d�S)a1
        Return index of first occurrence of minimum over requested axis.

        NA/null values are excluded.

        Parameters
        ----------
        axis : {0 or 'index', 1 or 'columns'}, default 0
            The axis to use. 0 or 'index' finds the row label of the minimum in
            each column, 1 or 'columns' finds the column label of the minimum
            in each row.
        skipna : bool, default True
            Exclude NA/null values. If an entire row/column is NA, the result
            will be NA.

        Returns
        -------
        Series
            Indexes of minima along the specified axis.

        Raises
        ------
        ValueError
            * If the row/column is empty

        See Also
        --------
        Series.idxmin : Return index of the minimum element.

        Notes
        -----
        This method is the DataFrame version of ``ndarray.argmin``.

        Examples
        --------
        Consider a dataset containing food consumption in Argentina.

        >>> df = pd.DataFrame({'consumption': [10.51, 103.11, 55.48],
        ...                    'co2_emissions': [37.2, 19.66, 1712]},
        ...                    index=['Pork', 'Wheat Products', 'Beef'])

        >>> df
                        consumption  co2_emissions
        Pork                  10.51         37.20
        Wheat Products       103.11         19.66
        Beef                  55.48       1712.00

        By default, it returns the index for the minimum value in each column.

        >>> df.idxmin()
        consumption                Pork
        co2_emissions    Wheat Products
        dtype: object

        To return the index for the minimum value in each row, use ``axis="columns"``.

        >>> df.idxmin(axis="columns")
        Pork                consumption
        Wheat Products    co2_emissions
        Beef                consumption
        dtype: object
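
The same lookups can be sketched without pandas to show what each axis returns. This is a simplified illustration, assuming no missing values; the variable names are chosen for the example only.

```python
index = ["Pork", "Wheat Products", "Beef"]
columns = {
    "consumption": [10.51, 103.11, 55.48],
    "co2_emissions": [37.2, 19.66, 1712.0],
}

# axis=0: for each column, the row label of its smallest value.
# min() returns the first position on ties, matching "first occurrence".
per_column = {
    name: index[min(range(len(values)), key=values.__getitem__)]
    for name, values in columns.items()
}

# axis=1: for each row, the column label holding its smallest value.
per_row = {
    label: min(columns, key=lambda name: columns[name][pos])
    for pos, label in enumerate(index)
}
```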
        )r�r�cs"g|]}|dkr�|ntj�qS)r)r�r�)r�r)r�r�r�r�g"sz$DataFrame.idxmin.<locals>.<listcomp>)r�)r2rYZ	nanargminr�r�r�r�r�r3r	r�)r�r�r�r8r$r�)r�r��idxmin!"s=

zDataFrame.idxmincs\|j|�}tj|j||d�}t|tj�s,t�|j|���fdd�|D�}|j	||j
|�d�S)a.
        Return index of first occurrence of maximum over requested axis.

        NA/null values are excluded.

        Parameters
        ----------
        axis : {0 or 'index', 1 or 'columns'}, default 0
            The axis to use. 0 or 'index' finds the row label of the maximum in
            each column, 1 or 'columns' finds the column label of the maximum
            in each row.
        skipna : bool, default True
            Exclude NA/null values. If an entire row/column is NA, the result
            will be NA.

        Returns
        -------
        Series
            Indexes of maxima along the specified axis.

        Raises
        ------
        ValueError
            * If the row/column is empty

        See Also
        --------
        Series.idxmax : Return index of the maximum element.

        Notes
        -----
        This method is the DataFrame version of ``ndarray.argmax``.

        Examples
        --------
        Consider a dataset containing food consumption in Argentina.

        >>> df = pd.DataFrame({'consumption': [10.51, 103.11, 55.48],
        ...                    'co2_emissions': [37.2, 19.66, 1712]},
        ...                    index=['Pork', 'Wheat Products', 'Beef'])

        >>> df
                        consumption  co2_emissions
        Pork                  10.51         37.20
        Wheat Products       103.11         19.66
        Beef                  55.48       1712.00

        By default, it returns the index for the maximum value in each column.

        >>> df.idxmax()
        consumption     Wheat Products
        co2_emissions             Beef
        dtype: object

        To return the index for the maximum value in each row, use ``axis="columns"``.

        >>> df.idxmax(axis="columns")
        Pork              co2_emissions
        Wheat Products     consumption
        Beef              co2_emissions
        dtype: object
        )r�r�cs"g|]}|dkr�|ntj�qS)r)r�r�)r�r)r�r�r�r��"sz$DataFrame.idxmax.<locals>.<listcomp>)r�)r2rYZ	nanargmaxr�r�r�r�r�r3r	r�)r�r�r�r8r$r�)r�r��idxmaxj"s=

zDataFrame.idxmax)�axis_numr�cCs4|dkr|jS|dkr|jStdt|��d���dS)z/
        Let's be explicit about this.
        rr�zAxis must be 0 or 1 (got rMN)r�r�r��repr)r�r�r�r�r�r��"s
zDataFrame._get_agg_axiscs*|s|n|j�}�fdd�}|j||d�S)a3

        Get the mode(s) of each element along the selected axis.

        The mode of a set of values is the value that appears most often.
        It can be multiple values.

        Parameters
        ----------
        axis : {0 or 'index', 1 or 'columns'}, default 0
            The axis to iterate over while searching for the mode:

            * 0 or 'index' : get mode of each column
            * 1 or 'columns' : get mode of each row.

        numeric_only : bool, default False
            If True, only apply to numeric columns.
        dropna : bool, default True
            Don't consider counts of NaN/NaT.

            .. versionadded:: 0.24.0

        Returns
        -------
        DataFrame
            The modes of each column or row.

        See Also
        --------
        Series.mode : Return the highest frequency value in a Series.
        Series.value_counts : Return the counts of values in a Series.

        Examples
        --------
        >>> df = pd.DataFrame([('bird', 2, 2),
        ...                    ('mammal', 4, np.nan),
        ...                    ('arthropod', 8, 0),
        ...                    ('bird', 2, np.nan)],
        ...                   index=('falcon', 'horse', 'spider', 'ostrich'),
        ...                   columns=('species', 'legs', 'wings'))
        >>> df
                   species  legs  wings
        falcon        bird     2    2.0
        horse       mammal     4    NaN
        spider   arthropod     8    0.0
        ostrich       bird     2    NaN

        By default, missing values are not considered, and the modes of wings
        are both 0 and 2. The second row of species and legs contains ``NaN``
        because each has only one mode, while the result has two rows.

        >>> df.mode()
          species  legs  wings
        0    bird   2.0    0.0
        1     NaN   NaN    2.0

        With ``dropna=False``, ``NaN`` values are considered and can be
        the mode (as for wings).

        >>> df.mode(dropna=False)
          species  legs  wings
        0    bird     2    NaN

        With ``numeric_only=True``, only the mode of numeric columns is
        computed, and columns of other types are ignored.

        >>> df.mode(numeric_only=True)
           legs  wings
        0   2.0    0.0
        1   NaN    2.0

        To compute the mode over columns and not rows, use the axis parameter:

        >>> df.mode(axis='columns', numeric_only=True)
                   0    1
        falcon   2.0  NaN
        horse    4.0  NaN
        spider   0.0  8.0
        ostrich  2.0  NaN
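
To see why a column can have more than one mode, here is a small sketch using `collections.Counter`. It is an illustration rather than the pandas algorithm: `modes` is a hypothetical helper that always drops float NaN, which corresponds to the default ``dropna=True``.

```python
import math
from collections import Counter


def modes(values):
    """Return every most-frequent value, sorted, ignoring NaN."""
    kept = [v for v in values
            if not (isinstance(v, float) and math.isnan(v))]
    if not kept:
        return []
    counts = Counter(kept)
    top = max(counts.values())
    # All values tied for the highest count are modes.
    return sorted(value for value, count in counts.items() if count == top)


wings = [2.0, math.nan, 0.0, math.nan]
legs = [2, 4, 8, 2]
species = ["bird", "mammal", "arthropod", "bird"]

wing_modes = modes(wings)
leg_modes = modes(legs)
species_modes = modes(species)
```

The result frame needs as many rows as the column with the most modes, so columns with fewer modes are padded with ``NaN``.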

DataFrame.quantile

        Return values at the given quantile over requested axis.

        Parameters
        ----------
        q : float or array-like, default 0.5 (50% quantile)
            Value between 0 <= q <= 1, the quantile(s) to compute.
        axis : {0, 1, 'index', 'columns'}, default 0
            Equals 0 or 'index' for row-wise, 1 or 'columns' for column-wise.
        numeric_only : bool, default True
            If False, the quantile of datetime and timedelta data will be
            computed as well.
        interpolation : {'linear', 'lower', 'higher', 'midpoint', 'nearest'}
            This optional parameter specifies the interpolation method to use,
            when the desired quantile lies between two data points `i` and `j`:

            * linear: `i + (j - i) * fraction`, where `fraction` is the
              fractional part of the index surrounded by `i` and `j`.
            * lower: `i`.
            * higher: `j`.
            * nearest: `i` or `j` whichever is nearest.
            * midpoint: (`i` + `j`) / 2.

        Returns
        -------
        Series or DataFrame

            If ``q`` is an array, a DataFrame will be returned where the
              index is ``q``, the columns are the columns of self, and the
              values are the quantiles.
            If ``q`` is a float, a Series will be returned where the
              index is the columns of self and the values are the quantiles.

        See Also
        --------
        core.window.Rolling.quantile: Rolling quantile.
        numpy.percentile: Numpy function to compute the percentile.

        Examples
        --------
        >>> df = pd.DataFrame(np.array([[1, 1], [2, 10], [3, 100], [4, 100]]),
        ...                   columns=['a', 'b'])
        >>> df.quantile(.1)
        a    1.3
        b    3.7
        Name: 0.1, dtype: float64
        >>> df.quantile([.1, .5])
               a     b
        0.1  1.3   3.7
        0.5  2.5  55.0

        Specifying `numeric_only=False` will also compute the quantile of
        datetime and timedelta data.

        >>> df = pd.DataFrame({'A': [1, 2],
        ...                    'B': [pd.Timestamp('2010'),
        ...                          pd.Timestamp('2011')],
        ...                    'C': [pd.Timedelta('1 days'),
        ...                          pd.Timedelta('2 days')]})
        >>> df.quantile(0.5, numeric_only=False)
        A                    1.5
        B    2010-07-02 12:00:00
        C        1 days 12:00:00
        Name: 0.5, dtype: object
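
        The interpolation methods listed under ``interpolation`` can be
        compared on a small column. The following is a minimal sketch, not
        part of the original docstring; the column name ``a`` and the
        quantile ``0.4`` are chosen only for illustration.

```python
import pandas as pd

# Illustrative data: four sorted values, so positions run from 0 to 3.
df = pd.DataFrame({"a": [1, 2, 3, 4]})

# The 0.4 quantile sits at fractional position 0.4 * (4 - 1) = 1.2,
# i.e. between the data points i = 2 and j = 3.
results = {
    method: df.quantile(0.4, interpolation=method)["a"]
    for method in ("linear", "lower", "higher", "midpoint", "nearest")
}

for method, value in results.items():
    print(f"{method:>8}: {value}")
```

        Here ``linear`` gives ``2 + (3 - 2) * 0.2``, ``lower`` and ``higher``
        give the two neighbouring points, ``midpoint`` gives their average,
        and ``nearest`` picks the closer neighbour.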

DataFrame.to_timestamp

        Cast to DatetimeIndex of timestamps, at *beginning* of period.

        Parameters
        ----------
        freq : str, default frequency of PeriodIndex
            Desired frequency.
        how : {'s', 'e', 'start', 'end'}
            Convention for converting period to timestamp; start of period
            vs. end.
        axis : {0 or 'index', 1 or 'columns'}, default 0
            The axis to convert (the index by default).
        copy : bool, default True
            If False then underlying input data is not copied.

        Returns
        -------
        DataFrame with DatetimeIndex
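
        A minimal sketch of the ``how`` convention, assuming a monthly
        PeriodIndex; the column name ``sales`` and the dates are
        illustrative and do not come from the original docstring.

```python
import pandas as pd

# Illustrative frame indexed by monthly periods.
periods = pd.period_range("2020-01", periods=3, freq="M")
df = pd.DataFrame({"sales": [10, 20, 30]}, index=periods)

# how="start" maps each period to its first timestamp,
# how="end" maps it to a timestamp at the end of the period.
start = df.to_timestamp(how="start")
end = df.to_timestamp(how="end")

print(start.index[0])
print(end.index[0])
```

        The original frame keeps its PeriodIndex; both calls return a new
        DataFrame with a DatetimeIndex.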

DataFrame.to_period

        Convert DataFrame from DatetimeIndex to PeriodIndex.

        Convert DataFrame from DatetimeIndex to PeriodIndex with desired
        frequency (inferred from index if not passed).

        Parameters
        ----------
        freq : str, optional
            Frequency of the PeriodIndex.
        axis : {0 or 'index', 1 or 'columns'}, default 0
            The axis to convert (the index by default).
        copy : bool, default True
            If False then underlying input data is not copied.

        Returns
        -------
        DataFrame with PeriodIndex
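
        A minimal sketch of the conversion, assuming a DatetimeIndex with
        one observation per month; the column name ``sales`` and the dates
        are illustrative and do not come from the original docstring.

```python
import pandas as pd

# Illustrative frame indexed by timestamps in the middle of each month.
dates = pd.to_datetime(["2020-01-15", "2020-02-15", "2020-03-15"])
df = pd.DataFrame({"sales": [10, 20, 30]}, index=dates)

# Passing freq explicitly; each timestamp becomes the month containing it.
monthly = df.to_period(freq="M")

print(monthly.index)
```

        The data values are unchanged; only the index type changes from
        DatetimeIndex to PeriodIndex.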

DataFrame.isin

        Whether each element in the DataFrame is contained in values.

        Parameters
        ----------
        values : iterable, Series, DataFrame or dict
            The result will only be true at a location if all the
            labels match. If `values` is a Series, that's the index. If
            `values` is a dict, the keys must be the column names,
            which must match. If `values` is a DataFrame,
            then both the index and column labels must match.

        Returns
        -------
        DataFrame
            DataFrame of booleans showing whether each element in the DataFrame
            is contained in values.

        See Also
        --------
        DataFrame.eq: Equality test for DataFrame.
        Series.isin: Equivalent method on Series.
        Series.str.contains: Test if pattern or regex is contained within a
            string of a Series or Index.

        Examples
        --------
        >>> df = pd.DataFrame({'num_legs': [2, 4], 'num_wings': [2, 0]},
        ...                   index=['falcon', 'dog'])
        >>> df
                num_legs  num_wings
        falcon         2          2
        dog            4          0

        When ``values`` is a list, check whether every value in the DataFrame
        is present in the list (which animals have 0 or 2 legs or wings):

        >>> df.isin([0, 2])
                num_legs  num_wings
        falcon      True       True
        dog        False       True

        When ``values`` is a dict, we can pass values to check for each
        column separately:

        >>> df.isin({'num_wings': [0, 3]})
                num_legs  num_wings
        falcon     False      False
        dog        False       True

        When ``values`` is a Series or DataFrame, the index and column labels
        must match. Note that 'dog' does not match because it is missing from
        the index of ``other``.

        >>> other = pd.DataFrame({'num_legs': [8, 2], 'num_wings': [0, 2]},
        ...                      index=['spider', 'falcon'])
        >>> df.isin(other)
                num_legs  num_wings
        falcon      True       True
        dog        False      False
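
        The Series case described under ``values`` has no example in the
        docstring. A minimal sketch, reusing the frame from the examples;
        the Series contents are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"num_legs": [2, 4], "num_wings": [2, 0]},
                  index=["falcon", "dog"])

# The Series is aligned on the index: each row of df is compared
# with the single value stored under the same row label.
values = pd.Series([2, 0], index=["falcon", "dog"])
result = df.isin(values)

print(result)
```

        Row 'falcon' is compared with 2 and row 'dog' with 0, so only
        ``num_legs`` for 'dog' (value 4) fails to match.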

DataFrame.isin error messages
        cannot compute isin with a duplicate axis.
        only list-like or dict-like objects are allowed to be passed to
        DataFrame.isin(), you passed a '{type(values).__name__}'

DataFrame.index
        The index (row labels) of the DataFrame.

DataFrame.columns
        The column labels of the DataFrame.

DataFrame._AXIS_NUMBERS
        .. deprecated:: 1.1.0

DataFrame._AXIS_NAMES
        .. deprecated:: 1.1.0