3
���h{o � @ s� d Z ddlmZ ddlmZmZ ddlZddlZddlm Z m
Z
ddlZddl
mZ ddlmZ ddljjjZddlmZ ejeejd �d
d�ZG dd
� d
�ZG dd� d�ZG dd� deej�ZdS )a�
Read SAS7BDAT files
Based on code written by Jared Hobbs:
https://bitbucket.org/jaredhobbs/sas7bdat
See also:
https://github.com/BioStatMatt/sas7bdat
Partial documentation of the file format:
https://cran.r-project.org/package=sas7bdat/vignettes/sas7bdat.pdf
Reference for binary data compression:
http://collaboration.cmc.ec.gc.ca/science/rpn/biblio/ddj/Website/articles/CUJ/1992/9210/ross/ross.htm
� )�abc)�datetime� timedeltaN)�EmptyDataError�OutOfBoundsDatetime)�get_filepath_or_buffer)�Parser)�
ReaderBase)�
sas_datetimes�unit�returnc C s^ yt j| |dd�S tk
rX |dkr6| jdd� �S |dkrL| jdd� �S td��Y nX d S )
a�
Convert to Timestamp if possible, otherwise to datetime.datetime.
SAS float64 values lack precision beyond millisecond resolution, so
falling back to datetime.datetime loses no information.
Parameters
----------
sas_datetimes : {Series, Sequence[float]}
Dates or datetimes in SAS
unit : {str}
"d" if the floats represent dates, "s" for datetimes
Returns
-------
Series
Series of datetime64 dtype or datetime.datetime.
z
1960-01-01)r �origin�sc S s t ddd�t| d� S )Ni� � )�seconds)r r )� sas_float� r �8/tmp/pip-build-5_djhm0z/pandas/pandas/io/sas/sas7bdat.py�<lambda>7 s z$_convert_datetimes.<locals>.<lambda>�dc S s t ddd�t| d� S )Ni� r )�days)r r )r r r r r ; s zunit must be 'd' or 's'N)�pdZto_datetimer �apply�
ValueError)r
r r r r �_convert_datetimes s
r c @ s e Zd ZdS )�_subheader_pointerN)�__name__�
__module__�__qualname__r r r r r A s r c @ s e Zd ZdS )�_columnN)r r r r r r r r E s r c @ s� e Zd ZdZd>dd�Zdd� Zdd � Zd
d� Zdd
� Zdd� Z dd� Z
dd� Zdd� Zdd� Z
dd� Zdd� Zdd� Zdd� Zd d!� Zd"d#� Zd$d%� Zd&d'� Zd(d)� Zd*d+� Zd,d-� Zd.d/� Zd0d1� Zd2d3� Zd4d5� Zd6d7� Zd?d8d9�Zd:d;� Zd<d=� Z dS )@�SAS7BDATReadera!
Read SAS files in SAS7BDAT format.
Parameters
----------
path_or_buf : path name or buffer
Name of SAS file or file-like object pointing to SAS file
contents.
index : column identifier, defaults to None
Column to use as index.
convert_dates : boolean, defaults to True
Attempt to convert dates to Pandas datetime values. Note that
some rarely used SAS date formats may be unsupported.
blank_missing : boolean, defaults to True
Convert empty strings to missing values (SAS uses blanks to
indicate missing character variables).
chunksize : int, defaults to None
If given, return a SAS7BDATReader object for iteration that yields
chunks with the given number of lines.
encoding : string, defaults to None
String encoding.
convert_text : bool, defaults to True
If False, text variables are left as raw bytes.
convert_header_text : bool, defaults to True
If False, header text, including column names, is left as raw
bytes.
NTc
C s� || _ || _|| _|| _|| _|| _|| _d| _d| _g | _ g | _
g | _g | _g | _
d | _g | _g | _g | _d| _d| _d| _t|�\| _} } } t| jt�r�t| jd�| _| j| _| j� | j� d S )Nzlatin-1� r �rb)�index�
convert_dates�
blank_missing� chunksize�encoding�convert_text�convert_header_text�default_encoding�compression�column_names_strings�column_names�column_formats�columns�%_current_page_data_subheader_pointers�_cached_page�_column_data_lengths�_column_data_offsets�
_column_types�_current_row_in_file_indexZ_current_row_on_page_indexr �_path_or_buf�
isinstance�str�open�handle�_get_properties�_parse_metadata)
�selfZpath_or_bufr# r$ r% r& |