SeaTunnel Setup

Standalone mode setup



Choose a Spark version to download from here, for instance (currently SeaTunnel only accepts Spark 2.0):


Next, extract the saved archive using tar:

tar xvf spark-*

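Then move the extracted directory to /opt/spark with mv (the trailing slash makes the glob match only the directory, not the archive that is still in the same folder):

sudo mv spark-*/ /opt/spark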

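Now configure the Spark environment by appending the exports to ~/.profile with echo. Use single quotes so that $PATH and $SPARK_HOME are written literally and expanded only when the profile is loaded:

echo 'export SPARK_HOME=/opt/spark' >> ~/.profile
echo 'export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin' >> ~/.profile
echo 'export PYSPARK_PYTHON=/usr/bin/python3' >> ~/.profile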

Then load it into the current shell:

source ~/.profile
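
Quoting matters when export lines are written with echo: inside double quotes the shell expands $PATH and $SPARK_HOME at the moment echo runs, before SPARK_HOME exists, while single quotes write the variables literally. The following is a minimal sketch of the single-quoted form; profile_demo is a hypothetical temporary file standing in for ~/.profile so the real profile is not touched.

```shell
# Sketch: append export lines with single quotes and load them.
# profile_demo is a throwaway temp file standing in for ~/.profile.
profile_demo="$(mktemp)"
unset SPARK_HOME

# Single quotes: the variables are written literally into the file.
echo 'export SPARK_HOME=/opt/spark' >> "$profile_demo"
echo 'export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin' >> "$profile_demo"

# Keep the last line so we can inspect what was actually written.
written="$(tail -n 1 "$profile_demo")"

# Load the file the same way `source ~/.profile` would.
. "$profile_demo"
rm -f "$profile_demo"

echo "written line: $written"
echo "SPARK_HOME:   $SPARK_HOME"
```

After loading, SPARK_HOME is set and both Spark directories are on PATH for the current shell.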

We can now start the standalone Spark master server:

cd /opt/spark/sbin

The Spark web user interface should then be available at http://localhost:8080/.
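
To confirm from the terminal that the web interface is answering, something like the following could be used. This is only a sketch: wait_for_ui is a hypothetical helper name, it assumes curl is installed, and the retry count and timeout are arbitrary choices.

```shell
# wait_for_ui URL [ATTEMPTS]: poll URL until it answers or attempts run out.
# Hypothetical helper; assumes curl is available on the machine.
wait_for_ui() {
    url="$1"
    attempts="${2:-10}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f: treat HTTP errors as failure, -s: quiet,
        # --max-time 2: give up on each try after two seconds
        if curl -fs -o /dev/null --max-time 2 "$url"; then
            echo "UI is up at $url"
            return 0
        fi
        # Pause between tries, but not after the last one.
        if [ "$i" -lt "$attempts" ]; then
            sleep 1
        fi
        i=$((i + 1))
    done
    echo "UI did not respond at $url" >&2
    return 1
}
```

For example, wait_for_ui http://localhost:8080/ 10 returns 0 as soon as the master's page responds and 1 if it never does.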

Next, start a slave (worker) server and point it at the master:

./ spark://localhost:7077

To test Spark shell:


Type :q to exit the Scala shell:


Other basic commands:







If we're running Ubuntu on WSL, starting a Spark worker may fail with the error "localhost: ssh: connect to host localhost port 22: Connection refused". In that case, make sure an SSH server is running and generate a new SSH key for localhost. According to this:

If openssh-server is not installed:

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install openssh-server
sudo service ssh start

Take the following steps to enable ssh for localhost:

cd ~/.ssh
ssh-keygen                          # generate a public/private rsa key pair; use the default options
cat >> authorized_keys   # to append the key to the authorized_keys file
chmod 640 authorized_keys           # to set restricted permissions
sudo service ssh restart            # to pick up recent changes
ssh localhost
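
If ssh localhost still asks for a password, file permissions are a common cause: with StrictModes enabled (the OpenSSH default), sshd ignores key files that are writable by group or others. The sketch below checks for that; is_ssh_safe is a hypothetical helper name, not part of OpenSSH.

```shell
# is_ssh_safe PATH: succeed if PATH exists and is not writable
# by group or others. Hypothetical helper for illustration.
is_ssh_safe() {
    [ -e "$1" ] || return 1
    # find prints the path only when a group- or world-write bit is set,
    # so an empty result means the permissions are acceptable.
    [ -z "$(find "$1" -prune \( -perm -020 -o -perm -002 \) -print)" ]
}
```

It can be run against ~/.ssh and ~/.ssh/authorized_keys, for example: is_ssh_safe ~/.ssh/authorized_keys || echo "fix permissions". A mode of 640 or 600 passes the check, while 660 or 666 does not.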


Download SeaTunnel:

export version="2.1.2"
wget "${version}/apache-seatunnel-incubating-${version}-bin.tar.gz"
tar -xzvf "apache-seatunnel-incubating-${version}-bin.tar.gz"
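
Before extracting, it can help to confirm that the download is complete and really is a gzip-compressed tar archive. A minimal sketch, assuming the archive name used above; verify_archive is a hypothetical helper name.

```shell
version="2.1.2"
archive="apache-seatunnel-incubating-${version}-bin.tar.gz"

# verify_archive FILE: succeed only if FILE exists and tar can list
# its contents as a gzip-compressed archive.
verify_archive() {
    [ -f "$1" ] && tar -tzf "$1" >/dev/null 2>&1
}

if verify_archive "$archive"; then
    echo "$archive looks intact"
else
    echo "$archive is missing or corrupt" >&2
fi
```

Listing the archive with tar -tzf reads the whole file, so a truncated download is reported before anything is written to disk.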

Run the test demo (spark.streaming.conf.template only works on a Spark cluster):

cd "apache-seatunnel-incubating-${version}"
./bin/ \
--master "local[4]" \
--deploy-mode client \
--config ./config/spark.batch.conf.template
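
One detail about the master value: the square brackets in local[4] are glob characters to the shell, so quoting the argument is the safer form. The sketch below shows what can happen when it is left unquoted; demo_dir and the file name local4 are made up for the demonstration.

```shell
# Sketch: an unquoted local[4] is a glob pattern and can match a file.
demo_dir="$(mktemp -d)"
touch "$demo_dir/local4"

# Run both echos inside the demo directory, in subshells,
# so the current working directory is left unchanged.
unquoted="$(cd "$demo_dir" && echo local[4])"
quoted="$(cd "$demo_dir" && echo "local[4]")"

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

rm -rf "$demo_dir"
```

With a file named local4 present, the unquoted form expands to that file name, while the quoted form is passed through unchanged. In bash the unquoted form usually works because nothing matches, but zsh rejects an unmatched pattern by default.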


