How to configure Eclipse for
developing with Python and
Spark on Hadoop
https://enahwe.wordpress.com/2015/11/25/how-to-configure-eclipse-for-developing-with-python-and-spark-on-hadoop
Python is one of the most popular programming languages among data scientists, who use it to
write programs for feature engineering and machine learning.
Spark, through its DataFrame and machine learning APIs, lets data scientists who develop
in Python improve their programs' performance by running them on a Spark cluster.
But what if data scientists want their Python projects to be more industrial?
There are many benefits to developing with an IDE like Eclipse, in addition to developing in
web mode on notebook servers like Jupyter and Zeppelin.
This roadmap describes how to configure the Eclipse V4.3 IDE with the PyDev V4.x+ plugin in order to
develop with Python V2.6 or higher and Spark V1.5 or V1.6 on Hadoop YARN.
In this roadmap you will learn how to carry out the following tasks:
• How to execute the basic Spark example code “Word Counts”
• How to read a CSV file directly as a Spark DataFrame for processing SQL
• How to execute your Python-Spark application on a cluster with Hadoop YARN
• How to deploy your Python-Spark application in a production environment
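
To give a feel for the first topic in the list above, here is a minimal sketch of a "Word Counts" program written against the PySpark RDD API. It is not the code from the original article: the function names (`tokenize`, `word_counts`, `main`), the application name, and the `local[*]` master are assumptions chosen for illustration, and the input path is whatever text file you pass in.

```python
from __future__ import print_function

from operator import add


def tokenize(line):
    """Split one line of text into words on whitespace."""
    return line.split()


def word_counts(lines_rdd):
    """Turn an RDD of text lines into an RDD of (word, count) pairs."""
    return (lines_rdd
            .flatMap(tokenize)             # one record per word
            .map(lambda word: (word, 1))   # pair each word with a count of 1
            .reduceByKey(add))             # sum the counts for each word


def main(path, master="local[*]"):
    """Run the word count on `path` (hypothetical entry point, not called here)."""
    # Imported inside the function so the helpers above can be loaded
    # even on a machine where PySpark is not installed.
    from pyspark import SparkConf, SparkContext

    # "WordCounts" and the local master are illustrative assumptions.
    conf = SparkConf().setAppName("WordCounts").setMaster(master)
    sc = SparkContext(conf=conf)
    try:
        for word, count in word_counts(sc.textFile(path)).collect():
            print(word, count)
    finally:
        sc.stop()
```

Calling `main("/path/to/some/text/file")` from your own entry point runs the job locally. The pure `tokenize` helper can be tried on its own, without a Spark installation.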