Hi everyone! My name is Anthony Tong and I am a second-year Computer Science student at UC Berkeley. I am very excited to join the Modin team and make some contributions. I would say that I am currently most proficient in Python, Java, C, and SQL, but I am certainly willing and able to pick up new languages and frameworks if needed. I understand that the main tasks at hand are profiling and implementing pandas functions. I would lean slightly towards implementing pandas functions, but I would definitely be willing to tackle either of the two tasks (or any future tasks that may arise). I look forward to getting to know the other team members and contributing to Modin. Thanks!
Hi Anthony, great to have you! Most of this is Python work, so it should be relatively straightforward given your background.
The implementations that are highest priority are related to I/O, and we need to make sure we are benchmarking existing bottlenecks to see if there is a simple way to mitigate them. Thanks again and welcome!
Great to have you on board! It would be great if you could implement to_parquet. Here are some of the relevant links. Be sure to use pyarrow and only import it when you need it.
to_parquet using pyarrow: https://github.com/pandas-dev/pandas/blob/1700680381bdbfbc1abe9774f96881801b24d6ca/pandas/io/parquet.py#L76-L136
If you have any questions, just let us know!
I’ve been working on setting up the project and downloading all the necessary frameworks for Modin. I was just wondering what environment you use/recommend. Currently, I’ve been trying to use PyCharm but have had some issues with relative imports and whatnot.
I personally use either vim or Atom as my editor and manage my Python environments with virtualenv. However, I think it should be fine to use PyCharm when working on Modin since it's all in Python.
Ok awesome. Thanks for the tips!
Hi @Anthony_Tong, I’ve been developing in PyCharm with virtualenv. It isn’t the most straightforward process, but I can try to help.
No worries. I ended up setting it up using Atom because I didn’t want to deal with the different folders and import problems. Thanks anyway, though. I really appreciate how helpful everyone is.