I'm trying to load a CSV file into a Cassandra table using the COPY command, but the number of rows in my CSV file and the number of rows in Cassandra don't match.

Number of rows in the CSV file: 49765 (excluding the header)

Number of rows in the Cassandra table:

cqlsh:test_df> select Count(*) from test_table;

 count
-------
 46982

(1 rows)

Warnings :
Aggregation query used without partition key
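A quick way to cross-check the two numbers is to count both the data rows and the distinct primary-key values in the CSV. Cassandra overwrites a row when another row with the same primary key is inserted, so duplicate keys in the file are one possible (unconfirmed in this thread) reason for a lower count in the table. The sketch below assumes Python 3; the function name and the key column names in the usage comment are hypothetical, since the table schema is not shown here.

```python
import csv


def count_rows_and_distinct_keys(csv_path, key_columns, delimiter=","):
    """Return (data_row_count, distinct_key_count) for a CSV with a header row.

    key_columns is the list of header names that make up the table's
    primary key (an assumption the caller must supply).
    """
    total = 0
    keys = set()
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        for row in reader:
            total += 1
            # Build a tuple of the key column values for this row.
            keys.add(tuple(row[col] for col in key_columns))
    return total, len(keys)


# Example call (hypothetical key columns, adjust to the real schema):
# total, distinct = count_rows_and_distinct_keys("temp.csv", ["symbol", "price_date"])
# print(total, distinct)
```

If the distinct-key count is lower than the row count, the difference is the number of rows that would collapse into existing rows on import, regardless of the error shown below.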

COPY command:

COPY test_table (column1,column2,column3) from 'temp.csv' with delimiter = ',' and header = True;

Error:

Starting copy of test_df.test_bhavcopy with columns [symbol, instrument, expiry_dt, strike_pr, option_typ, open, high, low, close, settle_pr, contracts, val_inlakh, open_int, ch_in_oi, price_date, key].
Process ImportProcess-3:ate:  8387 rows/s; Avg. rate:  3937 rows/s
Traceback (most recent call last):
Process ImportProcess-2:
 File "X:\Anaconda\lib\multiprocessing\process.py", line 267, in _bootstrap
Traceback (most recent call last):
Process ImportProcess-1:
Traceback (most recent call last):
 File "X:\Anaconda\lib\multiprocessing\process.py", line 267, in _bootstrap
 File "X:\Anaconda\lib\multiprocessing\process.py", line 267, in _bootstrap
  self.run()
  File "X:\apache-cassandra-3.11.3\bin\..\pylib\cqlshlib\copyutil.py", line 2328, in run
  self.run()
  self.run()
  File "X:\apache-cassandra-3.11.3\bin\..\pylib\cqlshlib\copyutil.py", line 2328, in run
 File "X:\apache-cassandra-3.11.3\bin\..\pylib\cqlshlib\copyutil.py", line 2328, in run
  self.close()
 File "X:\apache-cassandra-3.11.3\bin\..\pylib\cqlshlib\copyutil.py", line 2332, in close
  self._session.cluster.shutdown()
   self.close()
 File "X:\apache-cassandra-3.11.3\bin\..\lib\cassandra-driver-internal-only-3.11.0-bb96859b.zip\cassandra-driver-3.11.0-bb96859b\cassandra\cluster.py", line 1259, in shutdown
  self.close()
  File "X:\apache-cassandra-3.11.3\bin\..\pylib\cqlshlib\copyutil.py", line 2332, in close
 File "X:\apache-cassandra-3.11.3\bin\..\pylib\cqlshlib\copyutil.py", line 2332, in close
   self._session.cluster.shutdown()
  self._session.cluster.shutdown()
  File "X:\apache-cassandra-3.11.3\bin\..\lib\cassandra-driver-internal-only-3.11.0-bb96859b.zip\cassandra-driver-3.11.0-bb96859b\cassandra\cluster.py", line 1259, in shutdown
 File "X:\apache-cassandra-3.11.3\bin\..\lib\cassandra-driver-internal-only-3.11.0-bb96859b.zip\cassandra-driver-3.11.0-bb96859b\cassandra\cluster.py", line 1259, in shutdown
  self.control_connection.shutdown()
 File "X:\apache-cassandra-3.11.3\bin\..\lib\cassandra-driver-internal-only-3.11.0-bb96859b.zip\cassandra-driver-3.11.0-bb96859b\cassandra\cluster.py", line 2850, in shutdown
  self._connection.close()
 File "X:\apache-cassandra-3.11.3\bin\..\lib\cassandra-driver-internal-only-3.11.0-bb96859b.zip\cassandra-driver-3.11.0-bb96859b\cassandra\io\asyncorereactor.py", line 373, in close
  AsyncoreConnection.create_timer(0, partial(asyncore.dispatcher.close, self))
 File "X:\apache-cassandra-3.11.3\bin\..\lib\cassandra-driver-internal-only-3.11.0-bb96859b.zip\cassandra-driver-3.11.0-bb96859b\cassandra\io\asyncorereactor.py", line 335, in create_timer
  cls._loop.add_timer(timer)
AttributeError: 'NoneType' object has no attribute 'add_timer'
  self.control_connection.shutdown()
  File "X:\apache-cassandra-3.11.3\bin\..\lib\cassandra-driver-internal-only-3.11.0-bb96859b.zip\cassandra-driver-3.11.0-bb96859b\cassandra\cluster.py", line 2850, in shutdown
  self.control_connection.shutdown()
   self._connection.close()
 File "X:\apache-cassandra-3.11.3\bin\..\lib\cassandra-driver-internal-only-3.11.0-bb96859b.zip\cassandra-driver-3.11.0-bb96859b\cassandra\cluster.py", line 2850, in shutdown
  File "X:\apache-cassandra-3.11.3\bin\..\lib\cassandra-driver-internal-only-3.11.0-bb96859b.zip\cassandra-driver-3.11.0-bb96859b\cassandra\io\asyncorereactor.py", line 373, in close
  self._connection.close()
 File "X:\apache-cassandra-3.11.3\bin\..\lib\cassandra-driver-internal-only-3.11.0-bb96859b.zip\cassandra-driver-3.11.0-bb96859b\cassandra\io\asyncorereactor.py", line 373, in close
  AsyncoreConnection.create_timer(0, partial(asyncore.dispatcher.close, self))
   AsyncoreConnection.create_timer(0, partial(asyncore.dispatcher.close, self))
 File "X:\apache-cassandra-3.11.3\bin\..\lib\cassandra-driver-internal-only-3.11.0-bb96859b.zip\cassandra-driver-3.11.0-bb96859b\cassandra\io\asyncorereactor.py", line 335, in create_timer
  File "X:\apache-cassandra-3.11.3\bin\..\lib\cassandra-driver-internal-only-3.11.0-bb96859b.zip\cassandra-driver-3.11.0-bb96859b\cassandra\io\asyncorereactor.py", line 335, in create_timer
  cls._loop.add_timer(timer)
  cls._loop.add_timer(timer)
AttributeError: 'NoneType' object has no attribute 'add_timer'
AttributeError: 'NoneType' object has no attribute 'add_timer'
Processed: 49765 rows; Rate:  4193 rows/s; Avg. rate:  3906 rows/s
49765 rows imported from 1 files in 12.742 seconds (0 skipped).

Maybe the mismatch is caused by this error.

2
Krishna Kumar November 19, 2018, 12:02

1 answer

Best answer

I found a fix: I edited asyncorereactor.py in

cassandra-driver-internal-only-3.11.0-bb96859b.zip/cassandra-driver-3.11.0-bb96859b/cassandra/io/asyncorereactor.py

changing the call AsyncoreConnection.create_timer() to self.create_timer(), as suggested in this post:

https://datastax-oss.atlassian.net/browse/PYTHON-862?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel
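For illustration, the edit can be sketched as a plain text substitution on the source of asyncorereactor.py. The old call is copied from the traceback above (line 373, in close); the helper name patch_close_call is hypothetical, and the sketch assumes the file has already been extracted from the zip and read into a string. It is a sketch of the change, not an official patch.

```python
# Call as it appears in the traceback (asyncorereactor.py, in close).
OLD_CALL = "AsyncoreConnection.create_timer(0, partial(asyncore.dispatcher.close, self))"
# Same call, made through the instance instead of the class name.
NEW_CALL = "self.create_timer(0, partial(asyncore.dispatcher.close, self))"


def patch_close_call(source_text):
    """Return (patched_text, replacements) for the given module source.

    Hypothetical helper: it only rewrites the exact call shown in the
    traceback and leaves everything else in the file untouched.
    """
    replacements = source_text.count(OLD_CALL)
    return source_text.replace(OLD_CALL, NEW_CALL), replacements


# Example use on an extracted copy of the file (path is up to the reader):
# with open("asyncorereactor.py") as f:
#     patched, n = patch_close_call(f.read())
```

Keeping a backup of the original zip before repacking the patched file makes it easy to revert if cqlsh stops starting.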

1
Krishna Kumar November 20, 2018, 12:23