I'm new to PySpark and Spark NLP, and I've been having a lot of issues running a Python script in PyCharm inside an Anaconda environment. I happened to see this page: https://github.com/JohnSnowLabs/spark-nlp/discussions/1022
It lists some steps for correctly installing Spark NLP on Windows 10, and I'm not sure whether I need to follow them:
Download winutils and put it in C:\hadoop\bin
Download Apache Spark 2.4.6 and extract it in C:\spark
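For reference, here is a minimal sketch of how these two steps usually connect to a Python script. It assumes the directories are C:\hadoop\bin and C:\spark as in the steps above, and that HADOOP_HOME should point at the parent of the bin folder, since Hadoop looks for winutils.exe under %HADOOP_HOME%\bin. The helper names `spark_env`, `winutils_path` and `apply_env` are hypothetical, made up for illustration; they are not part of PySpark or Spark NLP.

```python
import os
from pathlib import PureWindowsPath

# Assumed install locations, taken from the two steps above.
HADOOP_HOME = r"C:\hadoop"
SPARK_HOME = r"C:\spark"


def spark_env(hadoop_home=HADOOP_HOME, spark_home=SPARK_HOME):
    """Return the environment variables that match the two install steps."""
    return {
        "HADOOP_HOME": str(PureWindowsPath(hadoop_home)),
        "SPARK_HOME": str(PureWindowsPath(spark_home)),
    }


def winutils_path(hadoop_home=HADOOP_HOME):
    """Return where winutils.exe is expected: <HADOOP_HOME>\\bin\\winutils.exe."""
    return str(PureWindowsPath(hadoop_home) / "bin" / "winutils.exe")


def apply_env(env=None):
    """Set the variables for the current process without overwriting existing values."""
    for name, value in (env or spark_env()).items():
        os.environ.setdefault(name, value)
```

Calling `apply_env()` at the top of the script, before anything from `pyspark` is imported, would make the settings visible to the Spark process started from PyCharm. Setting the same variables in the Windows system settings or in the PyCharm run configuration should have the same effect.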
I'm very confused now, because before seeing this page I got the error `py4j.protocol.Py4JJavaError: An error occurred while calling o314.load.: java.lang.ClassNotFoundException: com.johnsnowlabs.nlp.` in PyCharm (see this question: java.lang.ClassNotFoundException: com.johnsnowlabs.nlp.DocumentAssembler spark in Pycharm with conda env), and I wonder what the right thing to do is now. Can someone help me, please? Thanks.