I set up the CDH builds of Spark (2.1.0.cloudera2) and Hadoop (hadoop-2.6.0-cdh5.13.3) locally on Mac OS Sierra, but kept hitting a snappy error:

org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null

at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:229)
	at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
	at parquet.hadoop.codec.SnappyDecompressor.decompress(SnappyDecompressor.java:62)
	at parquet.hadoop.codec.NonBlockedDecompressorStream.read(NonBlockedDecompressorStream.java:51)
	at java.io.DataInputStream.readFully(DataInputStream.java:195)
	at java.io.DataInputStream.readFully(DataInputStream.java:169)
...
...
...
WARN scheduler.TaskSetManager: Lost task 0.2 in stage 1.0 (TID 3, 10.100.44.114, executor 1): java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy
	at parquet.hadoop.codec.SnappyDecompressor.decompress(SnappyDecompressor.java:62)
	at parquet.hadoop.codec.NonBlockedDecompressorStream.read(NonBlockedDecompressorStream.java:51)
	at java.io.DataInputStream.readFully(DataInputStream.java:195)
	at java.io.DataInputStream.readFully(DataInputStream.java:169)
...
...

Most of the advice I found online boiled down to adding the snappy jar to Spark's classpath, but every classpath variant I tried still failed. For reference, those suggestions typically look something like the sketch below.
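This is only a sketch of what that advice amounts to: the jar path, main class, and application jar are placeholders, and spark.driver.extraClassPath / spark.executor.extraClassPath are the standard Spark properties those answers point at.

spark-submit \
  --conf spark.driver.extraClassPath=/path/to/snappy-java-1.0.4.1.jar \
  --conf spark.executor.extraClassPath=/path/to/snappy-java-1.0.4.1.jar \
  --class your.main.Class your-app.jar

None of these variants worked for me. In the end I came across a more hands-on workaround: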

An easy fix if you want to continue to use 1.0.4.1 on MacOSX with JDK 1.7:

# unpack the jar, give the Mac native library the .dylib name the JVM looks for,
# back up the original jar, then repack it with the extra file included
unzip snappy-java-1.0.4.1.jar
cd org/xerial/snappy/native/Mac/x86_64/
cp libsnappyjava.jnilib libsnappyjava.dylib
cd ../../../../../..
cp snappy-java-1.0.4.1.jar snappy-java-1.0.4.1.jar.old
jar cf snappy-java-1.0.4.1.jar org
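To sanity-check the repacked jar before pointing Spark at it, listing its contents should now show both native files (output trimmed; this assumes the steps above ran in the directory containing the jar):

jar tf snappy-java-1.0.4.1.jar | grep Mac/x86_64
# org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib
# org/xerial/snappy/native/Mac/x86_64/libsnappyjava.dylib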

Repacking the jar this way solved the problem; presumably it works because JDK 7 and later on macOS resolve native libraries to a .dylib file name, while snappy-java 1.0.4.1 only ships a .jnilib inside the jar. Previously, when I hit a snappy problem running the examples directly in IDEA, copying the libsnappyjava.jnilib file into the project root was enough; this time, copying it into $SPARK_HOME still didn't work, and I don't know whether renaming it to .dylib before copying it over would help.
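For anyone who wants to try that route, the attempt would look roughly like this. It is untested, and both the rename and the choice of drop-in directory are assumptions; the only part I am sure of is that the JVM searches the directories on java.library.path:

cd org/xerial/snappy/native/Mac/x86_64/   # inside the unpacked jar from above
cp libsnappyjava.jnilib libsnappyjava.dylib
cp libsnappyjava.dylib $SPARK_HOME/
# or point the driver JVM at whatever directory holds the .dylib explicitly:
# spark-submit --driver-java-options "-Djava.library.path=/path/to/native/dir" ...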



Published

26 July 2018

Category

Work Note

Tags