I am using Spark SQL to parse JSON and it works very well: it finds the schema and I am running queries with it.
Now I need to "flatten" the JSON, and I have read in the forums that the best way is to explode with Hive (Lateral View), so I am trying to do the same thing. But I cannot even create the context... Spark gives me an error and I cannot find how to solve it.
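For reference, the kind of flattening I am after looks roughly like this (a sketch only; the table name `people` and the array column `phones` are made-up placeholders, not my real schema):

import org.apache.spark.sql.hive.HiveContext

// schema inference already works for me, as described above
val df = hiveContext.read.json("people.json")
df.registerTempTable("people")

// LATERAL VIEW explode() turns each element of the array column into its own row
hiveContext.sql(
  "SELECT name, phone FROM people LATERAL VIEW explode(phones) p AS phone"
).show()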
As I said, at this point I only want to create the context:
println ("Create Spark Context:") val sc = new SparkContext( "local", "Simple", "$SPARK_HOME") println ("Create Hive context:") val hiveContext = new HiveContext(sc)
It gives me this error:
Create Spark Context:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/12/26 15:13:44 INFO Remoting: Starting remoting
15/12/26 15:13:45 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.80.136:40624]
Create Hive context:
15/12/26 15:13:50 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
15/12/26 15:13:50 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/12/26 15:13:56 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/12/26 15:13:56 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/12/26 15:13:58 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/12/26 15:13:58 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/12/26 15:13:59 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
15/12/26 15:14:01 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/12/26 15:14:01 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
    at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:226)
    at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:185)
    at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:392)
    at org.apache.spark.sql.hive.HiveContext.defaultOverrides(HiveContext.scala:174)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:177)
    at pebd.emb.Bicing$.main(Bicing.scala:73)
    at pebd.emb.Bicing.main(Bicing.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
Caused by: java.lang.OutOfMemoryError: PermGen space

Process finished with exit code 1
I know this is probably a very simple question, but I really do not know the cause of this error. Thanks in advance, everyone.
Here is the relevant part of the exception:
Caused by: java.lang.OutOfMemoryError: PermGen space
You need to increase the amount of PermGen memory given to the JVM. By default (see SPARK-1879), Spark's own launch scripts raise this to 128 MB, so I think you have to do something similar in your IntelliJ run configuration. Try adding -XX:MaxPermSize=128m to the "VM options" list.