It is actually quite simple; only with this lesson did the meaning of "multi-instance" become clear: the MySQL, PostgreSQL, and SparkSQL instances all point to the same underlying data in the SequoiaDB cluster.
1) Create a MySQL instance named myinst, listening on port 3306:
su - sdbadmin
cd /opt/sequoiasql/mysql
bin/sdb_sql_ctl addinst myinst -d database/3306/
bin/sdb_sql_ctl start myinst
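To confirm the new instance is actually running, the controller can be queried (this assumes the `status` subcommand of `sdb_sql_ctl` as shipped with SequoiaSQL-MySQL; adjust if your version differs):

```shell
# Sanity check (assumption: sdb_sql_ctl supports a 'status' subcommand
# for a named instance in this version).
cd /opt/sequoiasql/mysql
bin/sdb_sql_ctl status myinst
```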
2) Create the metauser user and grant it global privileges; it will be used to create the SparkSQL metastore database, which holds the metadata:
mysql -h127.0.0.1 -uroot
create user 'metauser'@'%' identified by 'metauser';
grant all on *.* to 'metauser'@'%';
flush privileges;
create database metastore character set 'latin1' collate 'latin1_bin';
note:
keep the MySQL instance's root user password empty
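A quick way to verify the new account (using the user, password, and port created above):

```shell
# Log in as the new user and confirm the metastore database is visible.
mysql -h127.0.0.1 -umetauser -pmetauser -e "show databases like 'metastore';"
```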
3) In the MySQL instance, create the database company and the table employee ( empno int, ename varchar(128), age int ), insert the data below, then query to confirm the data exists:
create database company;
use company;
create table employee (empno int,ename varchar(128),age int);
insert into employee values(10001,'georgi',48);
(empno:10001, ename:'georgi', age:48)
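The write can be verified directly from the shell; with only the single row above, the query should return exactly one record:

```shell
# Confirm the row landed (root login with empty password, per the note above).
mysql -h127.0.0.1 -uroot -e "select * from company.employee;"
```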
4) Create a PostgreSQL instance named pginst, listening on port 5432, and connect it to the SequoiaDB cluster:
cd /opt/sequoiasql/postgresql
bin/sdb_sql_ctl addinst pginst -d database/5432
bin/sdb_sql_ctl start pginst
bin/sdb_sql_ctl createdb company pginst
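Listing the databases on the new instance confirms both that it is up and that company was created:

```shell
# 'company' should appear in the database list of the pginst instance.
psql -p 5432 -l
```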
5) In the PostgreSQL instance, create the database company and a foreign table employee mapped to the table created through the MySQL instance, then query to confirm the data is visible:
psql -p 5432 company
create extension sdb_fdw;
create server sdb_server foreign data wrapper sdb_fdw
options (address '127.0.0.1', service '11810', preferedinstance 'A', transaction 'on');
create foreign table employee
(
empno integer,
ename text,
age integer
) server sdb_server
options (collectionspace 'company', collection 'employee', decimal 'on');
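Because the foreign table maps onto the same SequoiaDB collection that the MySQL instance writes to, the row inserted in step 3 should already be visible here:

```shell
# Query through the PostgreSQL instance; expect the (10001, 'georgi', 48) row.
psql -p 5432 company -c "select * from employee;"
```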
6) Install and deploy the SparkSQL instance and point its metadata at the MySQL instance; after installation Spark's home directory must be /home/sdbadmin/spark-2.4.4-bin-hadoop2.7
note: the SparkSQL installation package is in /home/sdbadmin/soft
cd /home/sdbadmin/soft
tar -xzvf spark* -C /home/sdbadmin
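Note that the extraction-directory flag is an uppercase -C (lowercase -c means create). A minimal, self-contained sketch of the flag, using a throwaway archive in /tmp rather than the real Spark tarball:

```shell
# Create a dummy archive, then extract it into a different directory
# with -C, exactly as done for the Spark package above.
mkdir -p /tmp/tar_demo/src /tmp/tar_demo/dest
echo "hello" > /tmp/tar_demo/src/file.txt
tar -czf /tmp/tar_demo/pkg.tar.gz -C /tmp/tar_demo/src file.txt
tar -xzf /tmp/tar_demo/pkg.tar.gz -C /tmp/tar_demo/dest
cat /tmp/tar_demo/dest/file.txt
```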
7) In the SparkSQL instance, create a database company and a table employee corresponding to the database company and table employee created in the MySQL instance, so the data is shared;
note:
use the spark-sql client for the SparkSQL operations
ssh-keygen -t rsa
ssh-copy-id sdbadmin@sdbserver1
cd /home/sdbadmin/spark-2.4.4-bin-hadoop2.7/conf
cp spark-env.sh.template spark-env.sh
echo "SPARK_MASTER_HOST=sdbserver1" >> spark-env.sh
cat > /home/sdbadmin/spark-2.4.4-bin-hadoop2.7/conf/hive-site.xml << EOF
<configuration>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore?useSSL=false</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>metauser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>metauser</value>
  </property>
  <property>
    <name>datanucleus.autoCreateSchema</name>
    <value>true</value>
    <description>Creates necessary schema on a startup if one doesn't exist. Set this to false, after creating it once</description>
  </property>
</configuration>
EOF
cp /opt/sequoiadb/spark/spark-sequoiadb_2.11-3.4.jar /home/sdbadmin/spark-2.4.4-bin-hadoop2.7/jars/
cp /opt/sequoiadb/java/sequoiadb-driver-3.4.jar /home/sdbadmin/spark-2.4.4-bin-hadoop2.7/jars/
cp /home/sdbadmin/soft/mysql-jdbc.jar /home/sdbadmin/spark-2.4.4-bin-hadoop2.7/jars/
cp log4j.properties.template log4j.properties
sed -i 's/log4j.rootCategory=INFO, console/log4j.rootCategory=ERROR, console/g' log4j.properties
cd /home/sdbadmin/spark-2.4.4-bin-hadoop2.7
sbin/start-all.sh
jps
bin/spark-sql
create database company;
use company;
create table company.employee
(
empno int,
ename string,
age int
) using com.sequoiadb.spark options (host 'localhost:11810', collectionspace 'company', collection 'employee', username '', password '');
select avg(age) from company.employee;
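With the single row (age 48) written in step 3, avg(age) should come back as 48.0. And since all three SQL instances share the same SequoiaDB collection, the same data can be cross-checked from the MySQL side:

```shell
# The row is visible through the MySQL instance as well: one shared copy
# of the data, three SQL front ends.
mysql -h127.0.0.1 -uroot -e "select * from company.employee;"
```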