I am running PySpark on my PC (Windows 10), but I cannot import HiveContext:
from pyspark.sql import HiveContext
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-25-e3ae767de910> in <module>
----> 1 from pyspark.sql import HiveContext
ImportError: cannot import name 'HiveContext' from 'pyspark.sql' (C:\spark\spark-3.0.0-preview-bin-hadoop2.7\python\pyspark\sql\__init__.py)
The following code also raises an exception:
from pyspark.sql import SparkSession
spark = SparkSession \
    .builder \
    .appName("Python Spark create RDD example") \
    .config("spark.some.config.option", "some-value") \
    .getOrCreate()
# Create RDDs from an existing Hive repository
hive_ctx = spark.builder.enableHiveSupport()
hive_lines = hive_ctx.sql("SELECT name, age FROM users WHERE age > 18")
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-33-a4b672533a95> in <module>
1 # Create RDDs from an existing Hive repository
2 hive_ctx = spark.builder.enableHiveSupport()
----> 3 hive_lines = hive_ctx.sql("SELECT name, age FROM users WHERE age > 18")
AttributeError: 'Builder' object has no attribute 'sql'
This variant fails as well:
# Create RDDs from an existing Hive repository
hive_ctx = spark.builder.enableHiveSupport(sc)
hive_lines = hive_ctx.sql("SELECT name, age FROM users WHERE age > 18")
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-34-883015e73112> in <module>
1 # Create RDDs from an existing Hive repository
----> 2 hive_ctx = spark.builder.enableHiveSupport(sc)
3 hive_lines = hive_ctx.sql("SELECT name, age FROM users WHERE age > 18")
TypeError: enableHiveSupport() takes 1 positional argument but 2 were given
How should I correct the code?
Asked by user8270077; translated from Stack Overflow.
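One possible direction, offered as a sketch rather than a confirmed fix for this exact setup: HiveContext has been deprecated since Spark 2.0 in favor of SparkSession.builder.enableHiveSupport(). Both tracebacks above are consistent with how that method works: it takes no arguments and returns the Builder, not a session, so .sql() only becomes available after .getOrCreate() is called. The helper names build_hive_session and adult_users below are hypothetical, and the code assumes a Hive table named users actually exists.

```python
def build_hive_session(app_name="Python Spark create RDD example"):
    """Return a SparkSession with Hive support enabled (hypothetical helper)."""
    # Imported inside the function so that defining the helper does not
    # by itself require a working Spark installation.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .config("spark.some.config.option", "some-value")
        .enableHiveSupport()  # no arguments; returns the Builder itself
        .getOrCreate()        # returns the SparkSession, which has .sql()
    )


def adult_users(spark):
    """Run the query from the question on the given session.

    Assumes a Hive table named `users` with `name` and `age` columns.
    """
    return spark.sql("SELECT name, age FROM users WHERE age > 18")
```

Usage would be spark = build_hive_session() followed by hive_lines = adult_users(spark). If a session is already running in the notebook, getOrCreate() returns that existing session, so Hive support may not take effect; calling spark.stop() on the old session first is probably necessary in that case.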