Anthony Tong - Introduction

#1

Hi everyone! My name is Anthony Tong and I am a second-year Computer Science student at UC Berkeley. I am very excited to join the Modin team and make some contributions. I am currently most proficient in Python, Java, C, and SQL, but I am certainly willing and able to pick up new languages and frameworks if needed. I understand that the main tasks at hand are profiling and implementing pandas functions. I lean slightly towards implementing pandas functions, but I would definitely be willing to tackle either of the two tasks (or any future tasks that may arise). I look forward to getting to know the other team members and contributing to Modin. Thanks!

#2

Hi Anthony, great to have you! Most of this is Python work, so it should be relatively straightforward given your background.

The implementations that are highest priority are related to I/O, and we need to make sure we are benchmarking existing bottlenecks to see if there is a simple way to mitigate them. Thanks again and welcome!

#3

Hi @Anthony_Tong,

Great to have you on board! It would be great if you could implement to_parquet. Here are some of the relevant links. Be sure to use pyarrow and only import it when you need it.

Modin’s read_parquet: https://github.com/modin-project/modin/blob/17b7fccb28cf525bf1abd1a7be979c4cb5b66688/modin/engines/ray/pandas_on_ray/io.py#L31-L93

Pandas to_parquet using pyarrow: https://github.com/pandas-dev/pandas/blob/1700680381bdbfbc1abe9774f96881801b24d6ca/pandas/io/parquet.py#L76-L136
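As a rough illustration of the "only import it when you need it" advice, here is a minimal sketch. The function name `write_parquet_sketch` and its parameters are hypothetical and not part of Modin or pandas; it assumes pandas and pyarrow are installed and that the input is a plain pandas DataFrame.

```python
def write_parquet_sketch(df, path, compression="snappy"):
    """Write a pandas DataFrame to a Parquet file using pyarrow.

    Hypothetical helper for illustration only; not Modin's implementation.
    """
    # Deferred import: pyarrow is loaded only when this function is called,
    # so importing the surrounding module does not require pyarrow.
    import pyarrow as pa
    import pyarrow.parquet as pq

    # Convert the pandas DataFrame to an Arrow table, then write it to disk.
    table = pa.Table.from_pandas(df)
    pq.write_table(table, path, compression=compression)
```

A call such as `write_parquet_sketch(df, "out.parquet")` would then trigger the pyarrow import on first use.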

Be sure to add new tests as well that follow the new structure introduced here.

If you have any questions, just let us know!

#4

Hi @williamma12,

I’ve been working on setting up the project and downloading all the necessary frameworks for Modin. I was just wondering what environment you use/recommend. Currently, I’ve been trying to use PyCharm but have had some issues with relative imports and whatnot.

Thanks!

#5

I personally use either Vim or Atom as my editor and manage my Python environments with virtualenv. However, I think it should be fine to use PyCharm when working on Modin since it's all in Python.

#6

Ok awesome. Thanks for the tips!

#7

Hi @Anthony_Tong, I’ve been developing in PyCharm with virtualenv. It isn’t the most straightforward process, but I can try to help.

#8

No worries. I ended up setting it up using Atom because I didn’t want to deal with the different folders and import problems. Thanks anyway, though. I really appreciate how helpful everyone is.