How to create a separate "my_dataset.raw.train.txt" file for each Java file? #149

Open
brcsnt opened this issue Mar 8, 2022 · 3 comments

Comments

@brcsnt

brcsnt commented Mar 8, 2022

Hello,

Thank you again for sharing this project with us! I have a question.
When I run the "Preprocess.sh" script, all the Java files in the train folder are merged into a single "my_dataset.raw.train.txt" file.
Is it possible to create a separate "my_dataset.raw.train.txt" file for each Java file in the train folder (for instance, my_dataset.raw.train_code_1.txt, my_dataset.raw.train_code_2.txt, etc.)?

Many thanks in advance.

@urialon
Collaborator

urialon commented Mar 8, 2022 via email

@brcsnt
Author

brcsnt commented Mar 10, 2022

Hello @urialon,

Thank you so much for your quick response.
I looked into changing the JavaExtractor project. Is "extract.py" the file I need to modify to get the result I described above? I have tried some changes, but so far without success.

Many thanks in advance.
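
For reference, one way to get one output file per Java file without editing the extractor is to invoke it once per source file and redirect each run to its own file. The following is only a minimal sketch under stated assumptions: `extract_per_file`, the `{file}` placeholder, and the output prefix are hypothetical names introduced here, and the extraction command is supplied by the caller (it is assumed that the extractor can be run on a single file and prints its result to stdout).

```python
import subprocess
from pathlib import Path


def extract_per_file(src_dir, out_dir, command, prefix="my_dataset.raw.train_"):
    """Run `command` once per .java file and save each run's stdout separately.

    `command` is a list of arguments supplied by the caller (hypothetical, not
    taken from the project); every "{file}" in it is replaced by the path of
    the Java file being processed.
    """
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for java_file in sorted(src_dir.rglob("*.java")):
        args = [arg.replace("{file}", str(java_file)) for arg in command]
        # check=True raises CalledProcessError if the extractor exits non-zero.
        result = subprocess.run(args, capture_output=True, text=True, check=True)
        # Build the name from the relative path so that files with the same
        # name in different sub-folders do not overwrite each other.
        rel = java_file.relative_to(src_dir).with_suffix("")
        out_path = out_dir / f"{prefix}{'_'.join(rel.parts)}.txt"
        out_path.write_text(result.stdout)
        written.append(out_path)
    return written
```

The `command` list would be the extraction call that "Preprocess.sh" already uses, adapted so that it receives one file instead of the whole train folder; the exact arguments depend on the extractor and are not shown here.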

@urialon
Collaborator

urialon commented Mar 10, 2022 via email
