S.No. | Python | PySpark |
1 | Python is an interpreted, high-level, general-purpose programming language. | PySpark is the Python API for Apache Spark, i.e., the interface that gives access to Spark from Python. It also includes an interactive Python shell for Spark. |
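2 | On large datasets it is slower than PySpark, because a plain Python program runs on a single machine. | It can be up to about 10 times faster than plain Python on large datasets, depending on the workload and cluster size. On small data, Spark's startup overhead can make it slower.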
3 | Comparatively easy to learn, including for Java programmers, because of its simple syntax and standard libraries. | PySpark's API is more verbose and less familiar, which makes it harder to master.
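4 | It is a dynamically typed language, so type errors surface only at run time. | PySpark code is still dynamically typed Python, but DataFrames carry an explicit schema, so many column and type errors are reported when the query is analyzed rather than in the middle of a job.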
5 | A plain Python program runs locally; without the PySpark API it cannot distribute its work across a Spark cluster. | A program written with PySpark can be submitted to a Spark cluster (for example with spark-submit) and run in a distributed manner. |
6 | Python ships with many built-in packages and libraries, most of which can also be used from PySpark. | PySpark can be thought of as a set of libraries, since it contains sub-packages such as Spark SQL and Spark ML. |
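7 | Python code is executed directly by the Python interpreter. | With PySpark's DataFrame and SQL APIs, Python is mainly a scripting front end: once the job starts, the work runs in Spark's JVM engine. Python code runs on the workers only for RDD lambdas and Python UDFs.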
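8 | Can waste a lot of memory, especially when an iteration builds the whole collection in memory first. | Uses memory more sparingly: transformations are evaluated lazily, and values are produced one at a time as they are needed. |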
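9 | Python supports heavyweight process forking (as WSGI servers such as uWSGI do), but in standard CPython the global interpreter lock (GIL) keeps threads from executing Python bytecode in parallel. | Gets its concurrency from Spark itself, which splits data into partitions and processes them in parallel on executors across the cluster.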
10 | RDD operations are not available in plain Python. | RDD operations such as map, filter, and reduce are available.
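
Rows 8 and 10 of the table can be illustrated with a short sketch in plain Python. It first contrasts building a whole list in memory with producing values one at a time, and then chains map, filter, and reduce in the style of RDD operations. All variable names are illustrative assumptions. The PySpark lines appear only as comments, assume a local PySpark installation, and are not executed here, so treat this as a rough sketch and not as a benchmark.

```python
import sys
from functools import reduce

N = 100_000

# Eager: the list comprehension builds all N squares in memory at once.
eager_squares = [n * n for n in range(N)]

# Lazy: the generator expression produces one square at a time, on demand.
lazy_squares = (n * n for n in range(N))

# getsizeof reports only the container object itself, which is already
# enough to show the gap between the list and the generator.
eager_bytes = sys.getsizeof(eager_squares)
lazy_bytes = sys.getsizeof(lazy_squares)
print(f"list: {eager_bytes} bytes, generator: {lazy_bytes} bytes")

# A map -> filter -> reduce chain written with Python built-ins.
# map() and filter() are lazy; reduce() is what triggers the work.
squares = map(lambda n: n * n, range(1, 11))
even_squares = filter(lambda x: x % 2 == 0, squares)
even_square_sum = reduce(lambda a, b: a + b, even_squares)
print(even_square_sum)  # 4 + 16 + 36 + 64 + 100 = 220

# Rough PySpark equivalent (assumes pyspark is installed; not run here):
# from pyspark import SparkContext
# sc = SparkContext("local[*]", "even-squares")
# total = (sc.parallelize(range(1, 11))
#            .map(lambda n: n * n)
#            .filter(lambda x: x % 2 == 0)
#            .reduce(lambda a, b: a + b))
# sc.stop()
```

The plain Python version runs in a single process on one machine. The commented PySpark version expresses the same chain, but Spark would split the data into partitions and could run them on several executors.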
Author:
A.Yoga Sai Satwik
Noble John Paul