preprocess.py 623 B

import argparse

import pandas as pd

if __name__ == '__main__':
    # Command-line interface: one input CSV, two output CSVs.
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset", type=str, required=True, help="Path to the input dataset CSV")
    parser.add_argument("--out_train", type=str, required=True, help="Output path for the train split CSV")
    parser.add_argument("--out_test", type=str, required=True, help="Output path for the test split CSV")
    args = parser.parse_args()

    df = pd.read_csv(args.dataset)

    # Shuffle all rows, then cut at 60%: the first part is train, the rest is test.
    # Positional slicing replaces np.split, which emits deprecation warnings
    # on DataFrames in recent pandas versions.
    shuffled = df.sample(frac=1)
    split_at = int(0.6 * len(df))
    train, test = shuffled.iloc[:split_at], shuffled.iloc[split_at:]

    train.to_csv(args.out_train, index=False)
    test.to_csv(args.out_test, index=False)
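
The split in this script is unseeded, so each run produces a different train/test partition. A minimal sketch of a reproducible 60/40 split follows, using only the standard library. The helper name `split_indices` and its `train_frac` and `seed` parameters are hypothetical and not part of `preprocess.py`.

```python
import random


def split_indices(n_rows, train_frac=0.6, seed=42):
    """Return (train_idx, test_idx) from a seeded shuffle of range(n_rows).

    Hypothetical helper for illustration; not part of preprocess.py.
    """
    indices = list(range(n_rows))
    # A dedicated Random instance keeps the shuffle independent of global state.
    random.Random(seed).shuffle(indices)
    cut = int(train_frac * n_rows)
    return indices[:cut], indices[cut:]


train_idx, test_idx = split_indices(10)
# With a DataFrame df of 10 rows, the two parts could be selected with
# df.iloc[train_idx] and df.iloc[test_idx].
print(len(train_idx), len(test_idx))
```

With pandas alone, passing `random_state` to `DataFrame.sample` should have a similar effect on the shuffle.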