TianYong's Blog

Hadoop Lab Course: Hive

Posted 2020-06-09 | Category: Hadoop Big Data Technology

Hive

Every statement must end with a semicolon (;).

Creating a table

create table emp001(empno int,ename string,job string,mgr int,hiredate string,sal int,comm int,deptno int) row format delimited fields terminated by ',';

Loading data from HDFS into a Hive table

load data inpath '/001/hive/emp.csv' into table emp001;
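A quick way to confirm the load worked is to look at a few rows. The sketch below assumes the CSV columns line up with the emp001 definition. Note that load data inpath moves the file from its HDFS location into the table's directory rather than copying it. The local path in the last statement is hypothetical and only shows the variant for a file that lives on the client machine instead of HDFS.

```sql
-- Inspect a few rows; columns that are all NULL usually mean the
-- field delimiter or column types do not match the file.
select * from emp001 limit 5;

-- Count the loaded rows.
select count(*) from emp001;

-- Variant for a file on the local filesystem (hypothetical path):
-- load data local inpath '/path/to/emp.csv' into table emp001;
```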

Creating a partitioned table

create table emp_part001(empno int,ename string,job string,mgr int,hiredate string,sal int,comm int) partitioned by (deptno int) row format delimited fields terminated by ',';

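Loading data into partitions

Why partition: partitioning can be thought of as building an index on the Hive table. Each partition value is stored in its own directory, so a query that filters on the partition column reads only the matching partitions instead of scanning the whole table.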

insert into table emp_part001 partition(deptno=10) select empno,ename,job,mgr,hiredate,sal,comm from emp001 where deptno=10;
insert into table emp_part001 partition(deptno=20) select empno,ename,job,mgr,hiredate,sal,comm from emp001 where deptno=20;
insert into table emp_part001 partition(deptno=30) select empno,ename,job,mgr,hiredate,sal,comm from emp001 where deptno=30;
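Once the partitions are populated, the effect can be checked as sketched below. The last part shows a dynamic-partition insert as a possible alternative to writing one statement per department. It is not part of the original lab, and it should be used instead of the three static inserts, not in addition to them, or the rows will be duplicated.

```sql
-- List the partitions that now exist.
show partitions emp_part001;

-- Filtering on the partition column reads only the deptno=10 directory.
select empno, ename, sal from emp_part001 where deptno = 10;

-- Alternative (assumption: dynamic partitioning is allowed on the cluster).
-- The partition column must come last in the select list.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
insert into table emp_part001 partition(deptno)
select empno, ename, job, mgr, hiredate, sal, comm, deptno from emp001;
```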

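Creating a bucketed table

Why bucket: each table or partition can be further organized into buckets, which divide the data at a finer granularity. Bucketing is also defined on a single column. Hive hashes the column value and takes the remainder after dividing by the number of buckets to decide which bucket a row is stored in. Bucketing in Hive is essentially the same thing as partitioning in MapReduce: the data is split into as many files as there are reducers.

create table emp_bucket001(empno int,ename string,job string,mgr int,hiredate string,sal int,comm int,deptno int) clustered by (job) into 4 buckets row format delimited fields terminated by ',';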
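The statements below sketch how the bucketed table might be filled and inspected, assuming emp001 has already been loaded. The pmod(hash(job), 4) query only illustrates the "hash, then take the remainder" idea. The exact hash function Hive uses for bucketing depends on the version, so the numbers it prints are not guaranteed to match the real bucket files.

```sql
-- On Hive versions before 2.0, bucketing has to be enabled first:
-- set hive.enforce.bucketing=true;

-- Populate the bucketed table through a query so rows are hashed on job.
insert into table emp_bucket001 select * from emp001;

-- Illustration of the idea: hash the column, take the remainder by 4.
select distinct job, pmod(hash(job), 4) as approx_bucket from emp001;

-- Read only the first of the four buckets.
select * from emp_bucket001 tablesample(bucket 1 out of 4 on job);
```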

Common operations

Viewing a table's schema

describe tablename;
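Applied to the tables created in this lab, this might look like the following. The formatted variant prints extra metadata such as the storage location and the partition columns.

```sql
-- Column names and types only.
describe emp001;

-- Detailed metadata, including location and partition information.
describe formatted emp_part001;
```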

Listing all databases in Hive

show databases;

Listing all tables in Hive

First switch to a specific database:

use databaseName;
show tables;
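As a small sketch, assuming the lab tables were created in Hive's built-in default database, the listing can also be done without switching databases, or narrowed with a wildcard pattern:

```sql
-- List tables in a named database without running "use" first.
show tables in default;

-- List only tables whose names start with "emp".
show tables like 'emp*';
```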

Author: TTYONG

Published: 2020-06-09, 13:06

Last updated: 2023-06-04, 15:06

Original link: http://tianyong.fun/hadoop%E5%A4%A7%E6%95%B0%E6%8D%AE%E6%8A%80%E6%9C%AF%E4%B8%8E%E5%BA%94%E7%94%A8-hive(%E5%AE%9E%E9%AA%8C%E8%AF%BE).html

License: Please retain the original link and author attribution when reposting.
