Manually set up a Spark cluster on nodes where HDFS and YARN (2.7.1.2.4) were already installed via Ambari

  • Install JDK, set $JAVA_HOME and add to $PATH

      $ sudo add-apt-repository ppa:webupd8team/java
      $ sudo apt-get update
      $ sudo apt-get install oracle-java8-installer
    
  • Put spark-2.0.0-preview-bin-hadoop2.7 under /opt/, set $SPARK_HOME and add to $PATH
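
    The two environment-variable steps above amount to adding lines like these to ~/.bashrc. The JDK path below is what the webupd8team oracle-java8-installer package uses by default; adjust both paths to your layout:

    ```shell
    # Assumed paths: oracle-java8-installer puts the JDK under /usr/lib/jvm/java-8-oracle,
    # and the Spark tarball was unpacked under /opt as described above.
    export JAVA_HOME=/usr/lib/jvm/java-8-oracle
    export SPARK_HOME=/opt/spark-2.0.0-preview-bin-hadoop2.7
    export PATH=$JAVA_HOME/bin:$SPARK_HOME/bin:$PATH
    ```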

  • $SPARK_HOME/conf/spark-env.sh

      HADOOP_CONF_DIR=/etc/hadoop/conf    # Ambari-managed HDP keeps Hadoop config here, not under $HADOOP_HOME/conf
      SPARK_MASTER_IP=<HOSTNAME OF YOUR MASTER NODE>    # named SPARK_MASTER_HOST in Spark 2.0 final and later
    
  • $SPARK_HOME/conf/spark-defaults.conf

      spark.master            spark://<HOSTNAME OF YOUR MASTER NODE>:7077
      spark.serializer        org.apache.spark.serializer.KryoSerializer
      # The next two lines are only needed on HDP. Keep the comment on its own
      # line: a trailing comment would be read as part of the property value.
      spark.driver.extraJavaOptions     -Dhdp.version=current
      spark.yarn.am.extraJavaOptions    -Dhdp.version=current
    
  • $SPARK_HOME/conf/slaves

      <HOSTNAME OF YOUR MASTER NODE>
      <HOSTNAME OF YOUR SLAVE NODE 1>
      ...
      ...
      <HOSTNAME OF YOUR SLAVE NODE n>
    
  • Create a java-opts file under $SPARK_HOME/conf (Spark's launcher reads exactly this filename)

      -Dhdp.version=current
    
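    This file can be created in one line. The $HOME fallback below is only there so the snippet runs standalone; the real path is the $SPARK_HOME set earlier:

    ```shell
    # Write conf/java-opts; Spark's launcher appends its contents to the JVM
    # options of the processes it starts.
    # The fallback path is purely illustrative -- use your real SPARK_HOME.
    SPARK_HOME=${SPARK_HOME:-$HOME/spark-2.0.0-preview-bin-hadoop2.7}
    mkdir -p "$SPARK_HOME/conf"
    echo "-Dhdp.version=current" > "$SPARK_HOME/conf/java-opts"
    ```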
  • Set up passwordless SSH from the master to every node listed in conf/slaves (start-all.sh uses SSH to launch the workers)
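
    A typical sequence for this, run on the master node (the hostnames in the loop are placeholders; substitute the ones from your conf/slaves):

    ```shell
    # Generate a key pair if none exists, then copy the public key to each node.
    [ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
    for node in <HOSTNAME OF YOUR MASTER NODE> <HOSTNAME OF YOUR SLAVE NODE 1>; do
        ssh-copy-id "$node"
    done
    ```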

  • $SPARK_HOME/sbin/start-all.sh
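
    After start-all.sh, a quick sanity check with jps (ships with the JDK); the standalone master's web UI defaults to port 8080:

    ```shell
    # On the master: expect a Master process (plus a Worker, since the master
    # host is also listed in conf/slaves); on each slave: a Worker process.
    jps
    # The master web UI should list every worker:
    #   http://<HOSTNAME OF YOUR MASTER NODE>:8080
    ```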

Run the bundled sample program to verify the setup:

spark-submit --class org.apache.spark.examples.SparkPi \
             --master yarn \
             --deploy-mode cluster \
             /opt/spark-2.0.0-preview-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.0.0-preview.jar \
             1000
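
With --deploy-mode cluster the driver runs inside a YARN container, so the "Pi is roughly ..." line does not appear in the local console. Once the job finishes it can be pulled from the aggregated logs (the application ID is printed by spark-submit, or shown by `yarn application -list`):

```shell
# Substitute the real application ID reported for the SparkPi run.
yarn logs -applicationId <APPLICATION ID> | grep "Pi is roughly"
```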


Published: 27 June 2016
Category: Spark
