I. Preparation
Hive/Hadoop version compatibility matrix:
https://hive.apache.org/general/downloads/
1. Download and install jdk-8u341-linux-x64.tar.gz
2. Download and install hadoop-2.10.2.tar.gz
3. Download apache-hive-2.3.9-bin.tar.gz
II. Installing Hive
1. Extract the archive
Extract apache-hive-2.3.9-bin.tar.gz into /opt/:
tar -xf apache-hive-2.3.9-bin.tar.gz -C /opt/
2. Set HADOOP_HOME in the configuration file
Rename conf/hive-env.sh.template to hive-env.sh and set HADOOP_HOME:
# Set HADOOP_HOME to point to a specific hadoop install directory
# HADOOP_HOME=${bin}/../../hadoop
HADOOP_HOME=/opt/hadoop-2.10.2
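The rename-and-configure step above can be sketched as shell commands. This is a minimal sketch that simulates the conf directory with a temporary directory so it can be run safely anywhere; on a real installation, substitute /opt/apache-hive-2.3.9-bin/conf for $HIVE_CONF.

```shell
# Simulate Hive's conf directory with a temp dir (stand-in for
# /opt/apache-hive-2.3.9-bin/conf on a real installation).
HIVE_CONF=$(mktemp -d)
printf '# hive-env.sh template contents\n' > "$HIVE_CONF/hive-env.sh.template"

# Copy the template to hive-env.sh and append HADOOP_HOME.
cp "$HIVE_CONF/hive-env.sh.template" "$HIVE_CONF/hive-env.sh"
echo 'HADOOP_HOME=/opt/hadoop-2.10.2' >> "$HIVE_CONF/hive-env.sh"

# Confirm the setting landed.
grep '^HADOOP_HOME=' "$HIVE_CONF/hive-env.sh"
```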
3. Initialize the Derby database
Remove any existing Derby data directory:
root@node0001:/opt/apache-hive-2.3.9-bin# rm -rf metastore_db/
Initialize the Derby database:
root@node0001:/opt/apache-hive-2.3.9-bin# bin/schematool -initSchema -dbType derby
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-2.3.9-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.10.2/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User: APP
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.derby.sql
Initialization script completed
schemaTool completed
III. Usage
1. Start Hive
root@node0001:/opt/apache-hive-2.3.9-bin# bin/hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-2.3.9-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.10.2/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/opt/apache-hive-2.3.9-bin/lib/hive-common-2.3.9.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
2. Use Hive
hive> show databases;
OK
default
Time taken: 3.052 seconds, Fetched: 1 row(s)
hive> use default;
OK
Time taken: 0.021 seconds
hive> show tables;
OK
Time taken: 0.042 seconds
hive> create table student(id int, name string);
OK
Time taken: 0.407 seconds
hive> show tables;
OK
student
Time taken: 0.023 seconds, Fetched: 1 row(s)
hive> desc student;
OK
id int
name string
Time taken: 0.102 seconds, Fetched: 2 row(s)
hive> insert into student values(1,"zhangsan");
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20231013173034_61e5db72-71c1-4ee1-b817-4a392fffe16d
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2023-10-13 17:30:39,160 Stage-1 map = 100%, reduce = 0%
Ended Job = job_local1229433203_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://node0001:9000/user/hive/warehouse/student/.hive-staging_hive_2023-10-13_17-30-34_835_889218708810284417-1/-ext-10000
Loading data to table default.student
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 11 HDFS Write: 93 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 4.763 seconds
hive> insert into student values(2,"lisi");
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20231013173940_edca02cd-4fe6-4b90-beab-720bd084cf4e
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2023-10-13 17:39:42,492 Stage-1 map = 100%, reduce = 0%
Ended Job = job_local2131074738_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://node0001:9000/user/hive/warehouse/student/.hive-staging_hive_2023-10-13_17-39-40_232_8370877040343431751-1/-ext-10000
Loading data to table default.student
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 18 HDFS Write: 85 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 3.009 seconds
hive> select * from student;
OK
1 zhangsan
2 lisi
Time taken: 0.151 seconds, Fetched: 2 row(s)
hive> quit;
Data location on the DataNode:
root@node0001:/var/hadoop/pseudo/dfs/data/current/BP-795811933-172.17.0.2-1697188224715/current/finalized/subdir0/subdir0# ls -lh
total 285M
-rw-r--r-- 1 root root 128M Oct 13 17:19 blk_1073741825
-rw-r--r-- 1 root root 1.1M Oct 13 17:19 blk_1073741825_1001.meta
-rw-r--r-- 1 root root 14M Oct 13 17:19 blk_1073741826
-rw-r--r-- 1 root root 107K Oct 13 17:19 blk_1073741826_1002.meta
-rw-r--r-- 1 root root 128M Oct 13 17:19 blk_1073741827
-rw-r--r-- 1 root root 1.1M Oct 13 17:19 blk_1073741827_1003.meta
-rw-r--r-- 1 root root 14M Oct 13 17:19 blk_1073741828
-rw-r--r-- 1 root root 107K Oct 13 17:19 blk_1073741828_1004.meta
-rw-r--r-- 1 root root 11 Oct 13 17:30 blk_1073741830
-rw-r--r-- 1 root root 11 Oct 13 17:30 blk_1073741830_1006.meta
-rw-r--r-- 1 root root 7 Oct 13 17:39 blk_1073741833
-rw-r--r-- 1 root root 11 Oct 13 17:39 blk_1073741833_1009.meta
root@node0001:/var/hadoop/pseudo/dfs/data/current/BP-795811933-172.17.0.2-1697188224715/current/finalized/subdir0/subdir0#
root@node0001:/var/hadoop/pseudo/dfs/data/current/BP-795811933-172.17.0.2-1697188224715/current/finalized/subdir0/subdir0# cat blk_1073741830
1zhangsan
root@node0001:/var/hadoop/pseudo/dfs/data/current/BP-795811933-172.17.0.2-1697188224715/current/finalized/subdir0/subdir0# cat blk_1073741833
2lisi
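Note that `cat` prints `1zhangsan` and `2lisi` with no visible separator between the columns. Hive's default field delimiter for text-format tables is the non-printing control character `\001` (Ctrl-A), so the byte is present but invisible. A quick standalone demo (the /tmp path is just for illustration):

```shell
# Hive's default TEXTFILE field delimiter is \001 (Ctrl-A), a
# non-printing byte, so `cat` shows the columns run together.
printf '1\001zhangsan\n' > /tmp/blk_demo

cat /tmp/blk_demo      # appears as "1zhangsan"
cat -v /tmp/blk_demo   # prints "1^Azhangsan"; ^A marks the \001 byte
```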
IV. Storing the Metastore in MySQL
Prerequisites:
- Install MySQL 8.0
- Download mysql-connector-java-8.0.30.jar
1. Install the JDBC driver
Copy the MySQL driver into Hive's lib directory:
root@node0001:~# cp mysql-connector-java-8.0.30.jar /opt/apache-hive-2.3.9-bin/lib/
2. Create the metastore database
In MySQL, create a metastore database to hold Hive's metadata:
mysql> create database metastore;
Query OK, 1 row affected (0.00 sec)
mysql> quit;
3. Configure Hive
root@node0001:/opt/apache-hive-2.3.9-bin/conf# touch hive-site.xml
hive-site.xml contents:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- JDBC connection URL -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://127.0.0.1:3306/metastore?useSSL=false</value>
</property>
<!-- JDBC driver class -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
</property>
<!-- JDBC username -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<!-- JDBC password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
<!-- Hive's default working directory on HDFS -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
</configuration>
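One caveat worth knowing: depending on the MySQL server's time-zone configuration, Connector/J 8.x can fail with a "server time zone value ... is unrecognized" error. If schematool hits that, a serverTimezone parameter can be appended to the connection URL. This is an optional variant, not required in the run shown here:

```xml
<!-- Optional variant: only needed if schematool fails with a "server
     time zone" error under Connector/J 8.x. Note the XML-escaped & -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://127.0.0.1:3306/metastore?useSSL=false&amp;serverTimezone=UTC</value>
</property>
```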
4. Initialize the metastore schema
Run bin/schematool -initSchema -dbType mysql to initialize the metastore:
root@node0001:/opt/apache-hive-2.3.9-bin# bin/schematool -initSchema -dbType mysql
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-2.3.9-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.10.2/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://127.0.0.1:3306/metastore?useSSL=false
Metastore Connection Driver : com.mysql.cj.jdbc.Driver
Metastore connection User: root
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
5. Usage
root@node0001:/opt/apache-hive-2.3.9-bin# bin/hive
hive> show databases;
OK
default
Time taken: 5.964 seconds, Fetched: 1 row(s)
hive> show tables;
OK
Time taken: 0.085 seconds
hive> create table stu(id int, name string);
OK
Time taken: 1.074 seconds
hive> insert into stu values(3,"wangwu");
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20231013183233_4c43800c-a75d-48c5-8e86-cb81bc20c7d4
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2023-10-13 18:32:37,254 Stage-1 map = 100%, reduce = 0%
Ended Job = job_local2081344082_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://node0001:9000/user/hive/warehouse/stu/.hive-staging_hive_2023-10-13_18-32-33_348_4173909191419990915-1/-ext-10000
Loading data to table default.stu
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 9 HDFS Write: 85 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 4.569 seconds
hive> insert into stu values(4,"zhaoliu");
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20231013183256_1489e679-c84b-434a-9f70-fb7e9e46377d
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2023-10-13 18:32:57,753 Stage-1 map = 100%, reduce = 0%
Ended Job = job_local706844254_0002
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://node0001:9000/user/hive/warehouse/stu/.hive-staging_hive_2023-10-13_18-32-56_124_8498680797194661419-1/-ext-10000
Loading data to table default.stu
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 86 HDFS Write: 172 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 2.194 seconds
hive> select * from stu;
OK
3 wangwu
4 zhaoliu
Time taken: 0.234 seconds, Fetched: 2 row(s)
hive> quit;
View the data:
root@node0001:/var/hadoop/pseudo/dfs/data/current/BP-795811933-172.17.0.2-1697188224715/current/finalized/subdir0/subdir0# ls -lh
total 285M
-rw-r--r-- 1 root root 128M Oct 13 17:19 blk_1073741825
-rw-r--r-- 1 root root 1.1M Oct 13 17:19 blk_1073741825_1001.meta
-rw-r--r-- 1 root root 14M Oct 13 17:19 blk_1073741826
-rw-r--r-- 1 root root 107K Oct 13 17:19 blk_1073741826_1002.meta
-rw-r--r-- 1 root root 128M Oct 13 17:19 blk_1073741827
-rw-r--r-- 1 root root 1.1M Oct 13 17:19 blk_1073741827_1003.meta
-rw-r--r-- 1 root root 14M Oct 13 17:19 blk_1073741828
-rw-r--r-- 1 root root 107K Oct 13 17:19 blk_1073741828_1004.meta
-rw-r--r-- 1 root root 11 Oct 13 17:30 blk_1073741830
-rw-r--r-- 1 root root 11 Oct 13 17:30 blk_1073741830_1006.meta
-rw-r--r-- 1 root root 7 Oct 13 17:39 blk_1073741833
-rw-r--r-- 1 root root 11 Oct 13 17:39 blk_1073741833_1009.meta
-rw-r--r-- 1 root root 9 Oct 13 18:32 blk_1073741836
-rw-r--r-- 1 root root 11 Oct 13 18:32 blk_1073741836_1012.meta
-rw-r--r-- 1 root root 10 Oct 13 18:32 blk_1073741839
-rw-r--r-- 1 root root 11 Oct 13 18:32 blk_1073741839_1015.meta
root@node0001:/var/hadoop/pseudo/dfs/data/current/BP-795811933-172.17.0.2-1697188224715/current/finalized/subdir0/subdir0# cat blk_1073741836
3wangwu
root@node0001:/var/hadoop/pseudo/dfs/data/current/BP-795811933-172.17.0.2-1697188224715/current/finalized/subdir0/subdir0# cat blk_1073741839
4zhaoliu
6. Inspect the metadata
- Databases registered in the metastore: select * from DBS;
- Tables: select * from TBLS;
- Table columns: select * from COLUMNS_V2;
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| metastore |
| mysql |
| performance_schema |
| sys |
+--------------------+
5 rows in set (0.00 sec)
mysql> use metastore;
mysql> select * from DBS;
+-------+-----------------------+------------------------------------------+---------+------------+------------+
| DB_ID | DESC | DB_LOCATION_URI | NAME | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+------------------------------------------+---------+------------+------------+
| 1 | Default Hive database | hdfs://node0001:9000/user/hive/warehouse | default | public | ROLE |
+-------+-----------------------+------------------------------------------+---------+------------+------------+
1 row in set (0.00 sec)
mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+----------------------------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME | TBL_TYPE | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT | IS_REWRITE_ENABLED |
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+----------------------------------------+
| 1 | 1697193100 | 1 | 0 | root | 0 | 1 | stu | MANAGED_TABLE | NULL | NULL | 0x00 |
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+----------------------------------------+
1 row in set (0.00 sec)
mysql> select * from COLUMNS_V2;
+-------+---------+-------------+-----------+-------------+
| CD_ID | COMMENT | COLUMN_NAME | TYPE_NAME | INTEGER_IDX |
+-------+---------+-------------+-----------+-------------+
| 1 | NULL | id | int | 0 |
| 1 | NULL | name | string | 1 |
+-------+---------+-------------+-----------+-------------+
2 rows in set (0.00 sec)