This repository has been archived by the owner on Dec 26, 2022. It is now read-only.

DB : Import historical data for Permanode into Scylladb #486

Open
YingHan-Chen opened this issue Feb 17, 2020 · 5 comments

Comments

@YingHan-Chen
Contributor

YingHan-Chen commented Feb 17, 2020

The first transaction in the latest dump file released by the IOTA Foundation (IF) is dated June 28, 2019 19:10:25.
376G 1154472.dmp

In our own implementation, ScyllaDB uses about as much disk space after importing as the dump files themselves.
Additional space is needed for the commit log.

Divide dump files

Split large dump files into pieces of 1 million transactions each.
Each 1-million-transaction piece takes 2.6G of disk space.
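
A minimal sketch of such a split, assuming the dump is a text file with one transaction per line (the thread does not describe the file format). `split_dump` and its parameters are hypothetical names. The `<name>_<index>` output naming imitates the piece names in the scp log later in this thread (for example `1154472.dmp_16`), and starting the index at 0 is a guess.

```python
import os


def split_dump(path, out_dir, lines_per_piece=1_000_000):
    """Split a line-oriented dump file into pieces of at most lines_per_piece lines.

    Assumption: one transaction per line. Returns the list of piece paths written.
    """
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(path)
    pieces = []
    dst = None
    try:
        with open(path, "rb") as src:
            # Stream line by line so a multi-gigabyte piece is never held in memory.
            for line_no, line in enumerate(src):
                if line_no % lines_per_piece == 0:
                    if dst is not None:
                        dst.close()
                    piece_path = os.path.join(out_dir, f"{base}_{len(pieces)}")
                    pieces.append(piece_path)
                    dst = open(piece_path, "wb")
                dst.write(line)
    finally:
        if dst is not None:
            dst.close()
    return pieces
```

With the sizes reported above, a 376G dump at 2.6G per piece comes to roughly 145 pieces (376 / 2.6 ≈ 144.6).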

@YingHan-Chen
Contributor Author

Detail of Compaction
How to choose
We currently use the size-tiered compaction strategy.

@YingHan-Chen
Contributor Author

io-conf-configuration-for-hdd-storage

When using Scylla with HDD storage, it is recommended to use RAID0 on all of your available disks, and manually update the io.conf configuration file max-io-request parameter. This parameter sets the number of concurrent requests sent to the storage. The value for this parameter should be 3X (3 times) the number of your disks. For example, if you have 3 disks, you would set max-io-request=9.
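
The rule of thumb quoted above is simple arithmetic. A small sketch, where `max_io_request_for` is a hypothetical helper name and the printed `max-io-request=` line mirrors the quoted text rather than a verified io.conf syntax:

```python
def max_io_request_for(num_disks, per_disk=3):
    """Return the suggested concurrent-request limit: 3 requests per disk in the RAID0 set."""
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return per_disk * num_disks


if __name__ == "__main__":
    # Three disks, as in the quoted example.
    print(f"max-io-request={max_io_request_for(3)}")  # prints "max-io-request=9"
```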

@YingHan-Chen
Contributor Author

YingHan-Chen commented Feb 18, 2020

Speed of scp from node8 to node1 to NAS

1154472.dmp_16  100% 2634MB  10.8MB/s   04:04    
1154472.dmp_85  100% 2634MB  10.8MB/s   04:03    
1154472.dmp_105  100% 2634MB  10.7MB/s   04:05    
1154472.dmp_118  100% 2634MB  10.7MB/s   04:05    
1154472.dmp_97 100% 2634MB  10.6MB/s   04:08    
1154472.dmp_86 100% 2634MB   7.3MB/s   06:01    
1154472.dmp_55 100% 2634MB   6.9MB/s   06:21    
1154472.dmp_63 100% 2634MB   7.2MB/s   06:05    
1154472.dmp_58 100% 2634MB   7.2MB/s   06:07    
1154472.dmp_56 100% 2634MB   7.2MB/s   06:07  
...
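
To summarize a log like this, the progress lines can be parsed and aggregated. The sketch below assumes the line layout shown above (name, percent, size in MB, rate in MB/s, elapsed MM:SS) and that the files were copied one after another, which the thread does not state. `parse_scp_line` and `summarize` are hypothetical helper names.

```python
import re

# One scp progress line: name, percent, size in MB, rate in MB/s, elapsed MM:SS.
SCP_LINE = re.compile(
    r"^(?P<name>\S+)\s+(?P<pct>\d+)%\s+(?P<size>\d+)MB\s+"
    r"(?P<rate>[\d.]+)MB/s\s+(?P<min>\d+):(?P<sec>\d+)$"
)


def parse_scp_line(line):
    """Parse one progress line; return None for lines that do not match (e.g. '...')."""
    m = SCP_LINE.match(line.strip())
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "size_mb": int(m.group("size")),
        "rate_mb_s": float(m.group("rate")),
        "seconds": int(m.group("min")) * 60 + int(m.group("sec")),
    }


def summarize(lines):
    """Aggregate parsed lines into totals and an overall average rate."""
    rows = [r for r in map(parse_scp_line, lines) if r is not None]
    total_mb = sum(r["size_mb"] for r in rows)
    total_s = sum(r["seconds"] for r in rows)
    return {
        "files": len(rows),
        "total_mb": total_mb,
        "total_seconds": total_s,
        # Overall rate if transfers ran sequentially (assumption).
        "avg_mb_s": total_mb / total_s if total_s else 0.0,
    }
```

The reported figures are internally consistent: 2634 MB at 10.8 MB/s is about 244 seconds, which matches the 04:04 shown for the first file.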
@YingHan-Chen YingHan-Chen changed the title DB : Emulate disk usage for Permanode with Scylladb Mar 3, 2020
@YingHan-Chen
Contributor Author

YingHan-Chen commented Mar 3, 2020

I tried running seven importers in parallel.
Each importer imports a different set of pieces.

The number of requests served by ScyllaDB increased as follows.
image
image

image
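
The thread does not say how the pieces were divided among the seven importers. One simple possibility is a round-robin assignment, sketched below with a hypothetical helper `assign_pieces`; the only property it aims for is that every piece goes to exactly one importer.

```python
def assign_pieces(pieces, num_importers=7):
    """Distribute piece names round-robin so that no piece is imported twice."""
    if num_importers < 1:
        raise ValueError("num_importers must be at least 1")
    buckets = [[] for _ in range(num_importers)]
    for i, piece in enumerate(pieces):
        buckets[i % num_importers].append(piece)
    return buckets


if __name__ == "__main__":
    # Hypothetical piece names following the <dump>_<index> pattern seen in the scp log.
    names = [f"1154472.dmp_{i}" for i in range(20)]
    for importer_id, bucket in enumerate(assign_pieces(names)):
        print(importer_id, bucket)
```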

@YingHan-Chen
Contributor Author

We are running two processes to get transaction data.

One importer process, with 32 worker threads and one thread that reads the dump files and adds tasks to the task pool.

One listener process, with 8 worker threads and one thread that listens for ZMQ sn events and adds tasks to the task pool.
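
The reader-plus-workers layout described above is a producer/consumer pattern. The following is an illustrative sketch only, not the project's implementation: `run_importer`, `handle`, and `max_pending` are hypothetical names, and `handle` stands in for whatever writes one transaction to ScyllaDB.

```python
import queue
import threading


def run_importer(lines, handle, num_workers=32, max_pending=1024):
    """Feed items from `lines` to `num_workers` threads through a bounded task queue.

    Returns the list of exceptions raised by `handle`, if any.
    """
    tasks = queue.Queue(maxsize=max_pending)  # bounded so the reader cannot run far ahead
    stop = object()  # sentinel telling a worker to exit
    errors = []

    def reader():
        # Single producer: one task per input line (e.g. one transaction).
        for line in lines:
            tasks.put(line)
        # One sentinel per worker so that every worker terminates.
        for _ in range(num_workers):
            tasks.put(stop)

    def worker():
        while True:
            item = tasks.get()
            if item is stop:
                break
            try:
                handle(item)
            except Exception as exc:
                # Keep the worker alive so the bounded queue keeps draining.
                errors.append(exc)

    threads = [threading.Thread(target=reader)]
    threads += [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

Under the same assumptions, the listener process would have the same shape, with 8 workers and a reader that takes items from a ZMQ subscription instead of a file.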
