Make sure Java is installed correctly and the relevant environment variables (in particular JAVA_HOME) are configured.
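A quick way to verify this from a terminal (a minimal sketch; any recent JDK works, the exact version is not assumed here):
    # Print the installed Java version; if this fails, Java is not on the PATH
    java -version
    # Confirm that JAVA_HOME points at the JDK installation directory
    echo $JAVA_HOME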
Download the Apache Spark binary distribution and extract it, then move the extracted folder to a suitable directory.
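A typical command sequence for this step, assuming the Spark 2.4.0 build for Hadoop 2.7 (matching the jar version in the warning output below) and /opt/spark as the target directory, both of which are only example choices:
    # Download the prebuilt package from the Apache archive
    wget https://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.7.tgz
    # Extract it and move it to the chosen install location
    tar -xzf spark-2.4.0-bin-hadoop2.7.tgz
    sudo mv spark-2.4.0-bin-hadoop2.7 /opt/spark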
Open a terminal and use the cd command to change into the Spark folder.
From the Spark root directory (the script lives under sbin, not bin), start the standalone master and worker with the following command: ./sbin/start-all.sh
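Put together, the last two steps look like this (assuming the example /opt/spark location from above); jps can be used afterwards to confirm that the daemons started:
    cd /opt/spark
    # Start the standalone master and worker daemons
    ./sbin/start-all.sh
    # List running JVM processes; the output should include Master and Worker
    jps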
If output like the following appears (these are warnings, not fatal errors):
    WARN Utils: Your hostname, ubuntu-VirtualBox resolves to a loopback address: 127.0.1.1; using 192.168.1.3 instead (on interface enp0s3)
    WARNING: An illegal reflective access operation has occurred
    WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/path/to/spark/jars/spark-unsafe_2.11-2.4.0.jar) to method java.nio.Bits.unaligned()
    WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
    WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
    WARNING: All illegal access operations will be denied in a future release
You can adjust the logging configuration: find the log4j.properties.template file in Spark's conf directory, copy it, and rename the copy to log4j.properties, then lower the console log level there to reduce the noise.
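For example (a sketch; the INFO-to-WARN substitution assumes the default template shipped with Spark 2.4, whose root logger line is log4j.rootCategory=INFO, console):
    cd /opt/spark/conf
    cp log4j.properties.template log4j.properties
    # Raise the console log threshold so routine INFO messages are no longer printed
    sed -i 's/^log4j.rootCategory=INFO, console/log4j.rootCategory=WARN, console/' log4j.properties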
If a Java runtime error occurs when the application accesses S3, it can be resolved by adding the following code, which registers the S3A credentials and filesystem implementation with Hadoop:
    import org.apache.spark.api.java.JavaSparkContext;

    // Build the context from the existing SparkConf (conf, accessKey and secretKey
    // are defined elsewhere in the application)
    JavaSparkContext sparkContext = new JavaSparkContext(conf);
    // Supply the S3A credentials and filesystem implementation to Hadoop
    sparkContext.hadoopConfiguration().set("fs.s3a.access.key", accessKey);
    sparkContext.hadoopConfiguration().set("fs.s3a.secret.key", secretKey);
    sparkContext.hadoopConfiguration().set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
    // Enable AWS Signature Version 4, which some S3 regions require
    sparkContext.hadoopConfiguration().set("com.amazonaws.services.s3.enableV4", "true");
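Note that the prebuilt Spark binaries do not bundle the S3A connector itself, so the hadoop-aws module typically also has to be on the classpath. One way to do that when submitting the job is sketched below; the hadoop-aws version, application class and jar name are placeholders, not values taken from this guide:
    # Pull hadoop-aws (and its AWS SDK dependency) at submit time
    ./bin/spark-submit \
      --packages org.apache.hadoop:hadoop-aws:2.7.7 \
      --class com.example.MyS3Job \
      my-s3-job.jar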
Finally, check that the Spark web UI displays correctly: open http://localhost:4040/ in a browser. Note that port 4040 serves the UI of a running Spark application, so it only responds while an application is active; the standalone master started by start-all.sh serves its own UI on http://localhost:8080/ by default. If the Spark application is shown, the installation has completed successfully.
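Since port 4040 only answers while an application is running, one simple way to have something to look at (a sketch, not the only option) is to start the interactive shell and then refresh the page:
    # spark-shell runs a Spark application (local mode by default), so while the
    # shell stays open its UI is available at http://localhost:4040/
    ./bin/spark-shell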