�`^c@s�dZddlZddlmZddlmZmZmZmZm Z m
Z
mZmZm
Z
mZmZmZmZmZddlmZyddlmZWn!ek
r�ddlmZnXddd d
ddd
dddddddddddddgZdd"d��YZdefd��YZe de�defd��YZe de�dd#d��YZdd$d ��YZyeWnek
r�eZnXdd%d!��YZdS(&s+
csv.py - read/write/investigate CSV files
i�N(treduce(tErrort__version__twritertreadertregister_dialecttunregister_dialecttget_dialectt
list_dialectstfield_size_limitt
QUOTE_MINIMALt QUOTE_ALLtQUOTE_NONNUMERICt
QUOTE_NONEt__doc__(tDialect(tStringIOR
RRR
RRRtexcelt excel_tabR RRRRRtSnifferRRt
DictReadert
DictWritercBsVeZdZdZeZdZdZdZ dZ
dZdZdZ
d�Zd�ZRS(s�Describe an Excel dialect.
This must be subclassed (see csv.excel). Valid attributes are:
delimiter, quotechar, escapechar, doublequote, skipinitialspace,
lineterminator, quoting.
tcCs)|jtkrt|_n|j�dS(N(t __class__RtTruet_validt _validate(tself((s/sys/lib/python2.7/csv.pyt__init__-scCs:yt|�Wn%tk
r5}tt|���nXdS(N(t_Dialectt TypeErrorRtstr(Rte((s/sys/lib/python2.7/csv.pyR2sN(t__name__t
__module__Rt_nametFalseRtNonet delimitert quotechart
escapechartdoublequotetskipinitialspacetlineterminatortquotingRR(((s/sys/lib/python2.7/csv.pyRs cBs2eZdZdZdZeZeZdZ e
ZRS(s;Describe the usual properties of Excel-generated CSV files.t,t"s
(R!R"RR&R'RR)R$R*R+R
R,(((s/sys/lib/python2.7/csv.pyR9scBseZdZdZRS(sEDescribe the usual properties of Excel-generated TAB-delimited files.s (R!R"RR&(((s/sys/lib/python2.7/csv.pyRCss excel-tabcBsPeZddddd�Zd�Zed��Zejd��Zd�ZRS(RcOsI||_||_||_t||||�|_||_d|_dS(Ni(t_fieldnamestrestkeytrestvalRtdialecttline_num(Rtft
fieldnamesR0R1R2targstkwds((s/sys/lib/python2.7/csv.pyRJs cCs|S(N((R((s/sys/lib/python2.7/csv.pyt__iter__SscCsR|jdkr<y|jj�|_Wq<tk
r8q<Xn|jj|_|jS(N(R/R%Rtnextt
StopIterationR3(R((s/sys/lib/python2.7/csv.pyR5Vs
cCs
||_dS(N(R/(Rtvalue((s/sys/lib/python2.7/csv.pyR5dscCs�|jdkr|jn|jj�}|jj|_x|gkrX|jj�}q:Wtt|j|��}t|j�}t|�}||kr�||||j<n4||kr�%|j|D]}|j||<q�Wn|S(Ni( R3R5RR9tdicttziptlenR0R1(Rtrowtdtlftlrtkey((s/sys/lib/python2.7/csv.pyR9hs
N( R!R"R%RR8tpropertyR5tsetterR9(((s/sys/lib/python2.7/csv.pyRIs cBs>eZdddd�Zd�Zd�Zd�Zd�ZRS(RtraiseRcOsY||_||_|j�dkr4td|�n||_t||||�|_dS(NRFtignores-extrasaction (%s) must be 'raise' or 'ignore'(RFsignore(R5R1tlowert
ValueErrortextrasactionR(RR4R5R1RJR2R6R7((s/sys/lib/python2.7/csv.pyR�s
cCs,tt|j|j��}|j|�dS(N(R<R=R5twriterow(Rtheader((s/sys/lib/python2.7/csv.pytwriteheader�scCs�|jdkrug|D]}||jkr|^q}|rutddjg|D]}t|�^qP���qung|jD]}|j||j�^qS(NRFs(dict contains fields not in fieldnames: s, (RJR5RItjointreprtgetR1(Rtrowdicttktwrong_fieldstxRC((s/sys/lib/python2.7/csv.pyt
_dict_to_list�s(2cCs|jj|j|��S(N(RRKRU(RRQ((s/sys/lib/python2.7/csv.pyRK�scCs=g}x$|D]}|j|j|��q
W|jj|�S(N(tappendRURt writerows(RtrowdictstrowsRQ((s/sys/lib/python2.7/csv.pyRW�s
(R!R"RRMRURKRW(((s/sys/lib/python2.7/csv.pyRs
cBs>eZdZd�Zdd�Zd�Zd�Zd�ZRS(se
"Sniffs" the format of a CSV file (i.e. delimiter, quotechar)
Returns a Dialect object.
cCsdddddg|_dS(NR-s t;t t:(t preferred(R((s/sys/lib/python2.7/csv.pyR�scCs�|j||�\}}}}|s?|j||�\}}n|sQtd�ndtfd��Y}||_||_|p�d|_||_|S(sI
Returns a dialect (or None) corresponding to the sample
sCould not determine delimiterR2cBseZdZdZeZRS(tsniffeds
(R!R"R#R+R
R,(((s/sys/lib/python2.7/csv.pyR2�sR.(t_guess_quote_and_delimitert_guess_delimiterRRR)R&R'R*(Rtsamplet
delimitersR'R)R&R*R2((s/sys/lib/python2.7/csv.pytsniff�s cCsEg}xCdD];}tj|tjtjB�}|j|�}|r
Pq
q
W|sbdtddfSi}i}d}x|D]�|jdd}
| |
}|r�|j|d�d||<ny|jd d}
| |
}Wnt k
r�q{nX|r0|dks||kr0|j|d�d||<ny|jd
d}
Wnt k
r[q{nX| |
r{|d7}q{q{Wt
|d�|j��}|r�t
|d�|j��}
||
|k}|
d
kr�}
q�d}
d}tjditj|
�d 6|d6tj�}|j
|�r/t}nt}|||
|fS(s�
Looks for text enclosed between two identical quotes
(the probable quotechar) which are preceded and followed
by the same character (the probable delimiter).
For example:
,'some text',
The quote with the most wins, same with the delimiter.
If there is no quotechar the delimiter can't be determined
this way.
sF(?P<delim>[^\w
"'])(?P<space> ?)(?P<quote>["']).*?(?P=quote)(?P=delim)sC(?:^|
)(?P<quote>["']).*?(?P=quote)(?P<delim>[^\w
"'])(?P<space> ?)sD(?P<delim>>[^\w
"'])(?P<space> ?)(?P<quote>["']).*?(?P=quote)(?:$|
)s*(?:^|
)(?P<quote>["']).*?(?P=quote)(?:$|
)RitquoteitdelimtspacecSs||||kr|p|S(N((tatbtquotes((s/sys/lib/python2.7/csv.pyt<lambda>�scSs||||kr|p|S(N((RgRhtdelims((s/sys/lib/python2.7/csv.pyRjss
s]((%(delim)s)|^)\W*%(quote)s[^%(delim)s\n]*%(quote)s[^%(delim)s\n]*%(quote)s\W*((%(delim)s)|$)(sF(?P<delim>[^\w
"'])(?P<space> ?)(?P<quote>["']).*?(?P=quote)(?P=delim)sC(?:^|
)(?P<quote>["']).*?(?P=quote)(?P<delim>[^\w
"'])(?P<space> ?)sD(?P<delim>>[^\w
"'])(?P<space> ?)(?P<quote>["']).*?(?P=quote)(?:$|
)s*(?:^|
)(?P<quote>["']).*?(?P=quote)(?:$|
)N(tretcompiletDOTALLt MULTILINEtfindallR$R%t
groupindexRPtKeyErrorRtkeystescapetsearchR(RtdataRbtmatchestrestrtregexpRiRktspacestmtnRCR'ReR*t dq_regexpR)((s/sys/lib/python2.7/csv.pyR_�sb
' cCstd|jd��}gtd�D]}t|�^q%}tdt|��}d}i}i}i} dt|t|��}
}x�|
t|�kr|d7}xk||
|!D]\}xS|D]K}
|j|
i�}|j|
�}|j|d�d||<|||
<q�Wq�Wx�|j �D]�}
||
j
�}t|�dkrb|dddkrbq nt|�dkr�td�|�||
<|j||
�||
d||
dtd�|�df||
<q |d||
<q W|j
�}t
||�}d}d }x�t| �dkr�||kr�xp|D]h\}}|ddkr4|ddkr4|d||kr�|dks�||kr�|| |<q�q4q4W|d
8}qWt| �dkr| j �d}|dj|�|djd|�k}||fS|}
||7}q�W| s"dSt| �dkr�xZ|jD]L}|| j �kr>|dj|�|djd|�k}||fSq>Wng| j
�D]\}}||f^q�}|j�|d
d}|dj|�|djd|�k}||fS(s�
The delimiter /should/ occur the same number of times on
each row. However, due to malformed data, it may not. We don't want
an all or nothing approach, so we allow for small variations in this
number.
1) build a table of the frequency of each character on every line.
2) build a table of frequencies of this frequency (meta-frequency?),
e.g. 'x occurred 5 times in 10 rows, 6 times in 1000 rows,
7 times in 2 rows'
3) use the mode of the meta-frequency to determine the /expected/
frequency for that character
4) find out how often the character actually meets that goal
5) the character that best meets its goal is the delimiter
For performance reasons, the data is evaluated in chunks, so it can
try and evaluate the smallest portion of the data possible, evaluating
additional chunks as necessary.
s
ii
iicSs|d|dkr|p|S(Ni((RgRh((s/sys/lib/python2.7/csv.pyRjIscSsd|d|dfS(Nii((RgRh((s/sys/lib/python2.7/csv.pyRjOsg����?g{��?s%c Ri�N(Ri(tfilterR%tsplittrangetchrtminR>RPtcountRstitemsRtremovetfloatR]tsort(RRvRbtctasciitchunkLengtht iterationt
charFrequencytmodesRktstarttendtlinetchart
metaFrequencytfreqR�tmodeListttotaltconsistencyt thresholdRRtvReR*R@((s/sys/lib/python2.7/csv.pyR`sx%
&
!
+
c
Cstt|�|j|��}|j�}t|�}i}xt|�D]}d||<qIWd}x�D]�|dkr�Pn|d7}t|�|kr�qjnx�|j�D]�} xWtt t
tgD]3}
y|
|| �PWq�tt
fk
r�q�Xq�Wt|| �}
|
t kr$t}
n|
|| kr�|| dkrQ|
|| <q[|| =q�q�WqjWd}x�|j�D]�\} }t|�td�kr�t|| �|kr�|d7}q
|d8}qvy||| �Wn!ttfk
r�|d7}qvX|d8}qvW|dkS(Niii(RRRcR9R>R�R%RstinttlongR�tcomplexRIt
OverflowErrorR�ttypeR(
RRatrdrRLtcolumnstcolumnTypestitcheckedR?tcoltthisTypet hasHeadertcolType((s/sys/lib/python2.7/csv.pyt
has_header�sN
N( R!R"RRR%RcR_R`R�(((s/sys/lib/python2.7/csv.pyR�s M i((((( RRlt functoolsRt_csvRRRRRRRRR R
RRR
RRt cStringIORtImportErrort__all__RRRRR�t NameErrorR�R(((s/sys/lib/python2.7/csv.pyt<module>s2^
6"
|