After solving the PostgreSQL JSON to Spark RDD issue, I stumbled on another one just a day later: I started getting a
java.lang.ClassNotFoundException: org.postgresql.Driver and had no idea what to do. Neither, as it turned out, did the cluster admin!
After looking into this and learning the ropes of Zeppelin and Spark, I finally figured it out. We had been using dynamic dependency loading via
%dep, documented at Spark Interpreter for Apache Zeppelin, and it had stopped working. I am not sure why, but it was probably caused by an upgrade somewhere.
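For context, a %dep paragraph like the following is what we had been running at the top of the notebook. This is a sketch of the approach, not our exact paragraph; the driver coordinates are an example, and z.load resolves the artifact from a Maven repository at interpreter start:

```
%dep
// must run before the Spark interpreter starts in this notebook
z.reset()
// load the PostgreSQL JDBC driver (version here is illustrative)
z.load("org.postgresql:postgresql:42.2.5")
```

Note that %dep only takes effect if it runs before the Spark context is created, which is one reason it can fail in surprising ways after an environment change.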
So if your Spark interpreter is throwing a ClassNotFoundException, make sure you give it the required classes as explained in the doc linked above. I did this by modifying zeppelin/conf/zeppelin-env.sh and adding a SPARK_SUBMIT_OPTIONS entry.
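As a rough sketch of that change, the line below is what such an entry can look like in zeppelin-env.sh. The artifact coordinates and jar path are examples, not the exact values from our cluster; --packages and --jars are standard spark-submit flags that Zeppelin passes through:

```shell
# zeppelin/conf/zeppelin-env.sh
# Pull the PostgreSQL JDBC driver from Maven Central at interpreter launch
# (version is illustrative; pick one matching your server).
export SPARK_SUBMIT_OPTIONS="--packages org.postgresql:postgresql:42.2.5"

# Alternatively, point at a jar already present on the host:
# export SPARK_SUBMIT_OPTIONS="--jars /opt/jars/postgresql-42.2.5.jar"
```

After editing the file, restart the Zeppelin interpreter (or the Zeppelin daemon) so the new options are picked up.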