Note that, in an HA cluster, the Standby NameNode also
performs checkpoints of the namespace state, and thus it
is not necessary to run a Secondary NameNode,
CheckpointNode, or BackupNode in an HA cluster. In fact,
to do so would be an error. This also lets anyone reconfiguring a
non-HA HDFS cluster for HA reuse the hardware previously dedicated
to the Secondary NameNode.
<!-- mapreduce -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>256</value>
  <description>Memory allocated to each map task; must be between the
  minimum and maximum container sizes configured in YARN</description>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>256</value>
  <description>Memory allocated to each reduce task; must be between the
  minimum and maximum container sizes configured in YARN</description>
</property>
<!-- mapreduce -->
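The container bounds that the two descriptions above refer to are set by the YARN scheduler properties in `yarn-site.xml`. A minimal sketch; the property names are the standard YARN ones, but the values here are illustrative assumptions, not taken from this cluster:

```xml
<!-- yarn-site.xml: container size bounds (illustrative values) -->
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>256</value> <!-- smallest container YARN will hand out -->
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>2048</value> <!-- largest container YARN will hand out -->
</property>
```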
<!--
<property>
  <name>hive.querylog.location</name>
  <value>/user/hive/logs</value>
  <description>Location of Hive run time structured log file</description>
</property>
This probably does not need to be set; keeping the log on the local
file system seems more appropriate.
-->
hive> create table if not exists words(id INT, word STRING)
    > row format delimited
    > fields terminated by " "
    > lines terminated by "\n";
hive> load data local inpath "/opt/hive-test.txt" overwrite into table words;
hive> select * from words;
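Because the table is declared with fields terminated by a single space, `/opt/hive-test.txt` needs one `id word` pair per line. The real file is not shown here, so the following content is only a hypothetical example in the expected format:

```
1 hadoop
2 hive
3 spark
```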
for jar in `ls $TEZ_HOME | grep jar`; do
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$TEZ_HOME/$jar
done
for jar in `ls $TEZ_HOME/lib`; do
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$TEZ_HOME/lib/$jar
done
# the two loops above could be replaced with the single line below
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$TEZ_HOME/*:$TEZ_HOME/lib/*
# `hadoop-env.sh` describes `HADOOP_CLASSPATH` as "Extra Java CLASSPATH
# elements", which means a Hadoop component only needs to add its jars
# to `HADOOP_CLASSPATH`
import org.apache.spark.sql.hive.HiveContext
val sqlContext = new HiveContext(sc)  // sc is the SparkContext provided by spark-shell
sqlContext.sql("select * from words").collect().foreach(println)
sqlContext.sql("select id, word from words order by id").collect().foreach(println)
sqlContext.sql("insert into words values(7, \"jd\")") val df = sqlContext.sql("select * from words"); df.show()
val df = spark.read.json("file:///opt/spark/example/src/main/resources/people.json")
df.show()
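If the file at that path is the stock `people.json` shipped with the Spark examples, it holds one JSON object per line; the first record has no `age` field, so `df.show()` prints `null` in that cell:

```json
{"name":"Michael"}
{"name":"Andy", "age":30}
{"name":"Justin", "age":19}
```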
tickTime=2000
# The number of milliseconds of each tick
initLimit=10
# The number of ticks that the initial
# synchronization phase can take
syncLimit=5
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
dataDir=/tmp/zookeeper/zkdata
dataLogDir=/tmp/zookeeper/zkdatalog
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# for example's sake.
clientPort=2181
# the port at which the clients will connect
autopurge.snapRetainCount=3
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
# The number of snapshots to retain in dataDir
autopurge.purgeInterval=1
# Purge task interval in hours
# Set to "0" to disable auto purge feature
server.0=hd-master:2888:3888
server.1=hd-slave1:2888:3888
server.2=hd-slave2:2888:3888
# Define the zookeeper servers
# format: server.NO=HOST:PORT1:PORT2
# PORT1: port used to communicate with the leader
# PORT2: port used to re-elect the leader when the current leader fails
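Each server must also be told which `server.NO` it is: ZooKeeper reads the number from a `myid` file in its `dataDir`. A minimal sketch, assuming the `dataDir=/tmp/zookeeper/zkdata` configured above; run the matching line on each host:

```sh
echo 0 > /tmp/zookeeper/zkdata/myid   # on hd-master (server.0)
echo 1 > /tmp/zookeeper/zkdata/myid   # on hd-slave1 (server.1)
echo 2 > /tmp/zookeeper/zkdata/myid   # on hd-slave2 (server.2)
```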
agent1.channels.ch1.type=memory
# define a memory channel called `ch1` on `agent1`
agent1.sources.avro-source1.channels=ch1
agent1.sources.avro-source1.type=avro
agent1.sources.avro-source1.bind=0.0.0.0
agent1.sources.avro-source1.port=41414
# define an Avro source called `avro-source1` on `agent1` and tell it
# to bind to 0.0.0.0:41414
agent1.sinks.log-sink1.channel=ch1
agent1.sinks.log-sink1.type=logger
# define a logger sink that simply logs all events it receives
agent1.channels=ch1
agent1.sources=avro-source1
agent1.sinks=log-sink1
# Finally, all components have been defined; tell `agent1` which ones to activate
Start and test
$ flume-ng agent --conf /opt/flume/conf \
    -f /conf/flume.conf \
    -Dflume.root.logger=DEBUG,console \
    -n agent1
# the agent name specified by `-n agent1` must match an agent name in `-f /conf/flume.conf`
$ flume-ng avro-client --conf /opt/flume/conf \
    -H localhost -p 41414 \
    -F /opt/hive-test.txt \
    -Dflume.root.logger=DEBUG,console
# test flume by sending /opt/hive-test.txt to the Avro source as events