sh-3ll 1.0
DIR:/usr/local/lib64/python3.6/site-packages/pandas/io/__pycache__/
Current File : /usr/local/lib64/python3.6/site-packages/pandas/io/__pycache__/parquet.cpython-36.pyc
""" parquet compat """

from typing import Any, AnyStr, Dict, List, Optional
from warnings import catch_warnings

from pandas._typing import FilePathOrBuffer
from pandas.compat._optional import import_optional_dependency
from pandas.errors import AbstractMethodError

from pandas import DataFrame, get_option

from pandas.io.common import _expand_user, get_filepath_or_buffer, is_fsspec_url


def get_engine(engine: str) -> "BaseImpl":
    """ return our implementation """
    if engine == "auto":
        engine = get_option("io.parquet.engine")

    if engine == "auto":
        # try engines in this order
        engine_classes = [PyArrowImpl, FastParquetImpl]

        error_msgs = ""
        for engine_class in engine_classes:
            try:
                return engine_class()
            except ImportError as err:
                error_msgs += "\n - " + str(err)

        raise ImportError(
            "Unable to find a usable engine; "
            "tried using: 'pyarrow', 'fastparquet'.\n"
            "A suitable version of pyarrow or fastparquet "
            "is required for parquet support.\n"
            "Trying to import the above resulted in these errors:"
            f"{error_msgs}"
        )

    if engine == "pyarrow":
        return PyArrowImpl()
    elif engine == "fastparquet":
        return FastParquetImpl()

    raise ValueError("engine must be one of 'pyarrow', 'fastparquet'")


class BaseImpl:
    @staticmethod
    def validate_dataframe(df: DataFrame):
        if not isinstance(df, DataFrame):
            raise ValueError("to_parquet only supports IO with DataFrames")

        # must have value column names (strings only)
        if df.columns.inferred_type not in {"string", "empty"}:
            raise ValueError("parquet must have string column names")

        # index level names must be strings
        valid_names = all(
            isinstance(name, str) for name in df.index.names if name is not None
        )
        if not valid_names:
            raise ValueError("Index level names must be strings")

    def write(self, df: DataFrame, path, compression, **kwargs):
        raise AbstractMethodError(self)

    def read(self, path, columns=None, **kwargs):
        raise AbstractMethodError(self)


class PyArrowImpl(BaseImpl):
    def __init__(self):
        import_optional_dependency(
            "pyarrow", extra="pyarrow is required for parquet support."
        )
        import pyarrow.parquet

        # import utils to register the pyarrow extension types
        import pandas.core.arrays._arrow_utils  # noqa

        self.api = pyarrow

    def write(
        self,
        df: DataFrame,
        path: FilePathOrBuffer[AnyStr],
        compression: Optional[str] = "snappy",
        index: Optional[bool] = None,
        partition_cols: Optional[List[str]] = None,
        **kwargs
    ):
        self.validate_dataframe(df)

        from_pandas_kwargs: Dict[str, Any] = {"schema": kwargs.pop("schema", None)}
        if index is not None:
            from_pandas_kwargs["preserve_index"] = index

        table = self.api.Table.from_pandas(df, **from_pandas_kwargs)

        if is_fsspec_url(path) and "filesystem" not in kwargs:
            # make fsspec instance, which pyarrow will use to open paths
            import_optional_dependency("fsspec")
            import fsspec.core

            fs, path = fsspec.core.url_to_fs(path)
            kwargs["filesystem"] = fs
        else:
            path = _expand_user(path)
        if partition_cols is not None:
            # writes to multiple files under the given path
            self.api.parquet.write_to_dataset(
                table,
                path,
                compression=compression,
                partition_cols=partition_cols,
                **kwargs
            )
        else:
            # write to single output file
            self.api.parquet.write_table(
                table, path, compression=compression, **kwargs
            )

    def read(self, path, columns=None, **kwargs):
        if is_fsspec_url(path) and "filesystem" not in kwargs:
            import_optional_dependency("fsspec")
            import fsspec.core

            fs, path = fsspec.core.url_to_fs(path)
            should_close = False
        else:
            fs = kwargs.pop("filesystem", None)
            should_close = False
            path = _expand_user(path)

        if not fs:
            path, _, _, should_close = get_filepath_or_buffer(path)

        kwargs["use_pandas_metadata"] = True
        result = self.api.parquet.read_table(
            path, columns=columns, filesystem=fs, **kwargs
        ).to_pandas()
        if should_close:
            path.close()

        return result


class FastParquetImpl(BaseImpl):
    def __init__(self):
        # since pandas is a dependency of fastparquet
        # we need to import on first use
        fastparquet = import_optional_dependency(
            "fastparquet", extra="fastparquet is required for parquet support."
        )
        self.api = fastparquet

    def write(
        self,
        df: DataFrame,
        path,
        compression="snappy",
        index=None,
        partition_cols=None,
        **kwargs
    ):
        self.validate_dataframe(df)

        if "partition_on" in kwargs and partition_cols is not None:
            raise ValueError(
                "Cannot use both partition_on and partition_cols. "
                "Use partition_cols for partitioning data"
            )
        elif "partition_on" in kwargs:
            partition_cols = kwargs.pop("partition_on")

        if partition_cols is not None:
            kwargs["file_scheme"] = "hive"

        if is_fsspec_url(path):
            fsspec = import_optional_dependency("fsspec")

            # if filesystem is provided by fsspec, file must be opened in 'wb' mode.
            kwargs["open_with"] = lambda path, _: fsspec.open(path, "wb").open()
        else:
            path, _, _, _ = get_filepath_or_buffer(path)

        with catch_warnings(record=True):
            self.api.write(
                path,
                df,
                compression=compression,
                write_index=index,
                partition_on=partition_cols,
                **kwargs
            )

    def read(self, path, columns=None, **kwargs):
        if is_fsspec_url(path):
            fsspec = import_optional_dependency("fsspec")

            open_with = lambda path, _: fsspec.open(path, "rb").open()
            parquet_file = self.api.ParquetFile(path, open_with=open_with)
        else:
            path, _, _, _ = get_filepath_or_buffer(path)
            parquet_file = self.api.ParquetFile(path)

        return parquet_file.to_pandas(columns=columns, **kwargs)


def to_parquet(
    df: DataFrame,
    path: FilePathOrBuffer[AnyStr],
    engine: str = "auto",
    compression: Optional[str] = "snappy",
    index: Optional[bool] = None,
    partition_cols: Optional[List[str]] = None,
    **kwargs
):
    """
    Write a DataFrame to the parquet format.

    Parameters
    ----------
    df : DataFrame
    path : str or file-like object
        If a string, it will be used as Root Directory path
        when writing a partitioned dataset. By file-like object,
        we refer to objects with a write() method, such as a file handler
        (e.g. via builtin open function) or io.BytesIO. The engine
        fastparquet does not accept file-like objects.

        .. versionchanged:: 0.24.0

    engine : {'auto', 'pyarrow', 'fastparquet'}, default 'auto'
        Parquet library to use. If 'auto', then the option
        ``io.parquet.engine`` is used. The default ``io.parquet.engine``
        behavior is to try 'pyarrow', falling back to 'fastparquet' if
        'pyarrow' is unavailable.
    compression : {'snappy', 'gzip', 'brotli', None}, default 'snappy'
        Name of the compression to use. Use ``None`` for no compression.
    index : bool, default None
        If ``True``, include the dataframe's index(es) in the file output. If
        ``False``, they will not be written to the file.
        If ``None``, similar to ``True`` the dataframe's index(es)
        will be saved. However, instead of being saved as values,
        the RangeIndex will be stored as a range in the metadata so it
        doesn't require much space and is faster. Other indexes will
        be included as columns in the file output.

        .. versionadded:: 0.24.0

    partition_cols : str or list, optional, default None
        Column names by which to partition the dataset.
        Columns are partitioned in the order they are given.
        Must be None if path is not a string.

        .. versionadded:: 0.24.0

    kwargs
        Additional keyword arguments passed to the engine
    """
    if isinstance(partition_cols, str):
        partition_cols = [partition_cols]
    impl = get_engine(engine)
    return impl.write(
        df, path, compression=compression, index=index,
        partition_cols=partition_cols, **kwargs
    )


def read_parquet(path, engine: str = "auto", columns=None, **kwargs):
    """
    Load a parquet object from the file path, returning a DataFrame.

    Parameters
    ----------
    path : str, path object or file-like object
        Any valid string path is acceptable. The string could be a URL. Valid
        URL schemes include http, ftp, s3, and file. For file URLs, a host is
        expected. A local file could be:
        ``file://localhost/path/to/table.parquet``.
        A file URL can also be a path to a directory that contains multiple
        partitioned parquet files. Both pyarrow and fastparquet support
        paths to directories as well as file URLs. A directory path could be:
        ``file://localhost/path/to/tables`` or ``s3://bucket/partition_dir``

        If you want to pass in a path object, pandas accepts any
        ``os.PathLike``.

        By file-like object, we refer to objects with a ``read()`` method,
        such as a file handler (e.g. via builtin ``open`` function)
        or ``StringIO``.
    engine : {'auto', 'pyarrow', 'fastparquet'}, default 'auto'
        Parquet library to use. If 'auto', then the option
        ``io.parquet.engine`` is used. The default ``io.parquet.engine``
        behavior is to try 'pyarrow', falling back to 'fastparquet' if
        'pyarrow' is unavailable.
    columns : list, default=None
        If not None, only these columns will be read from the file.
    **kwargs
        Any additional kwargs are passed to the engine.

    Returns
    -------
    DataFrame
    """
    impl = get_engine(engine)
    return impl.read(path, columns=columns, **kwargs)
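The engine-selection fallback that get_engine() performs when engine="auto" can be sketched in isolation. This is a minimal standalone illustration, not pandas code: MissingEngine and WorkingEngine are hypothetical stand-ins for PyArrowImpl and FastParquetImpl.

```python
# Sketch of the "auto" engine fallback: try each candidate class in
# order, collect ImportError messages, and raise a combined error only
# when every backend fails. MissingEngine/WorkingEngine are hypothetical
# stand-ins for PyArrowImpl/FastParquetImpl.

class MissingEngine:
    """Simulates a backend whose library is not installed."""
    def __init__(self):
        raise ImportError("backend not installed")


class WorkingEngine:
    """Simulates a backend that imports cleanly."""
    name = "working"


def pick_engine(engine_classes):
    error_msgs = ""
    for engine_class in engine_classes:
        try:
            # first constructible engine wins
            return engine_class()
        except ImportError as err:
            error_msgs += "\n - " + str(err)
    raise ImportError(f"Unable to find a usable engine; tried:{error_msgs}")


print(pick_engine([MissingEngine, WorkingEngine]).name)  # prints "working"
```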