A Brief Introduction to sbt-assembly

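Spark and Spark Streaming projects often need to deploy external dependencies along with the application. A convenient approach is to package the project's compiled classes and its dependencies into a single jar, which is easy to upload and deploy. Scala projects can use sbt-assembly for this; it plays a role similar to Maven's assembly plugin. Project page: https://github.com/sbt/sbt-assembly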

Installing sbt-assembly

Add the following to plugins.sbt (<project root>/project/plugins.sbt) to install the plugin and to specify the resolver it is downloaded from:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.4")
resolvers += Resolver.url("bintray-sbt-plugins", url("http://dl.bintray.com/sbt/sbt-plugin-releases"))(Resolver.ivyStylePatterns)

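Choosing an sbt-assembly version
For sbt 0.13.6 and later, use 0.14.4
For older sbt 0.13.x releases, use 0.11.2
For sbt 0.12, use 0.9.2
To check your sbt version:
Run sbt on the command line to enter the sbt shell, then execute:
sbtVersion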

Excluding jars

sbt-assembly packages whatever is declared in the project's libraryDependencies. A dependency that should not be packaged can be excluded by marking it "provided":
[build.sbt]
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.1.0" % "provided"
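To show how "provided" interacts with dependencies that do need to be bundled, here is a minimal build.sbt sketch. The project name, version, and scalaVersion are assumptions made for illustration; the spark-streaming and spark-streaming-kafka-0-10 entries are likewise examples and should be adjusted to your own project.

```scala
// build.sbt (illustrative sketch; name, version and scalaVersion are assumed)
name := "spark-streaming-demo"

version := "0.1.0"

// Must match the _2.11 suffix of the Spark artifacts resolved by %%.
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  // Already present on the Spark cluster at runtime, so left out of the assembly jar.
  "org.apache.spark" %% "spark-core" % "2.1.0" % "provided",
  "org.apache.spark" %% "spark-streaming" % "2.1.0" % "provided",
  // Not part of the Spark distribution, so it is bundled into the assembly jar.
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.1.0"
)
```

Keep in mind that "provided" dependencies are available for compilation but are not packaged, so the runtime environment has to supply them.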
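Excluding the Scala library jars
Create an assembly.sbt file in the project root and add the following setting (note: sbt-assembly settings can go either in <project root>/build.sbt or in an assembly.sbt file in the project root):
[assembly.sbt]
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)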
Explicitly excluding a specific jar
[assembly.sbt]
assemblyExcludedJars in assembly := {
  val cp = (fullClasspath in assembly).value
  // The jars selected by this filter are the ones left out of the assembly
  cp filter {_.data.getName == "compile-0.1.0.jar"}
}
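If more than one jar has to be left out, the same setting can match against a set of file names. The following is a sketch only: excludedJarNames is a helper value introduced here for illustration, and example-lib-1.0.jar is a hypothetical jar name.

```scala
// assembly.sbt (illustrative sketch)
// Hypothetical helper: file names of the jars to leave out of the assembly.
val excludedJarNames = Set("compile-0.1.0.jar", "example-lib-1.0.jar")

assemblyExcludedJars in assembly := {
  val cp = (fullClasspath in assembly).value
  // Keep only the classpath entries whose file name is in the set;
  // these are the entries sbt-assembly will exclude.
  cp.filter(entry => excludedJarNames.contains(entry.data.getName))
}
```

Matching on the exact file name means the set has to be updated whenever a dependency version changes.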
Compile and package with sbt:
sbt clean compile assembly
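The assembled jar is normally written under target/scala-<scala version>/ with a name derived from the project name and version. If a fixed file name is more convenient for deployment scripts, the output name and the main class can be set explicitly. This is a sketch: the jar name and the main class below are hypothetical values.

```scala
// assembly.sbt (illustrative sketch)
// Hypothetical jar name, chosen only for this example.
assemblyJarName in assembly := "my-spark-app.jar"

// Hypothetical main class; sbt-assembly records it in the jar manifest.
mainClass in assembly := Some("com.example.StreamingApp")
```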

Jar conflicts

When jars conflict, you need to set assemblyMergeStrategy.
For example, a conflict produces an error like this:
[error] 1 error was encountered during merge
java.lang.RuntimeException: deduplicate: different file contents found in the following:
/Users/xueyintao/.ivy2/cache/org.apache.spark/spark-streaming-kafka-0-10_2.11/jars/spark-streaming-kafka-0-10_2.11-2.1.0.jar:org/apache/spark/unused/UnusedStubClass.class
/Users/xueyintao/.ivy2/cache/org.apache.spark/spark-tags_2.11/jars/spark-tags_2.11-2.1.0.jar:org/apache/spark/unused/UnusedStubClass.class
/Users/xueyintao/.ivy2/cache/org.spark-project.spark/unused/jars/unused-1.0.0.jar:org/apache/spark/unused/UnusedStubClass.class
        at sbtassembly.Assembly$.applyStrategies(Assembly.scala:140)

The error shows that the same class file, org/apache/spark/unused/UnusedStubClass.class, appears in three jars with different contents.

Add an assemblyMergeStrategy setting to resolve it:
[assembly.sbt]
assemblyMergeStrategy in assembly := {
  case PathList(ps @ _*) if ps.last endsWith "UnusedStubClass.class" => MergeStrategy.first
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
This setting keeps only the first copy whenever a path ends with UnusedStubClass.class. See the official documentation for the other merge strategies and configure them to suit your project.
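As a variation, the rule can match the full path of the conflicting class instead of a file-name suffix, which keeps it from accidentally applying to unrelated files. The sketch below also shows a second strategy, MergeStrategy.concat, applied to application.conf; that second rule is an assumption for illustration and is only useful if your jars actually contain such a file.

```scala
// assembly.sbt (illustrative sketch)
assemblyMergeStrategy in assembly := {
  // Exact path of the stub class from the error message: keep the first copy found.
  case PathList("org", "apache", "spark", "unused", "UnusedStubClass.class") =>
    MergeStrategy.first
  // Assumed example: concatenate config files instead of picking one.
  case "application.conf" =>
    MergeStrategy.concat
  // Everything else falls back to the plugin's default strategy.
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```

Always ending with the fallback case keeps the default handling for every path you have not explicitly overridden.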

Other use cases

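Packaging the dependencies and the project classes into two separate jars
To package all dependencies into one jar that does not contain the project's own classes:
sbt assemblyPackageDependency
This produces a jar whose name ends in -deps.jar
To package only the project's own classes:
[assembly.sbt]
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false, includeDependency = false)
Then run sbt assembly to build the jar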
