Lab 3 chn Problem Summary #54

Open
bqinmuzhi opened this issue Nov 16, 2023 · 1 comment
Comments

@bqinmuzhi

1. The following error appeared when operating on HDFS:
Zero blocklocations for /chn/train.csv. Name node is in safe mode.
Solution: the NameNode is currently in safe mode, so data on HDFS cannot be modified; force it to leave safe mode.
(1) Enter the Hadoop directory: cd /usr/local/hadoop
(2) Leave safe mode: bin/hdfs dfsadmin -safemode leave
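For reference, a minimal command sequence, assuming the same /usr/local/hadoop install path as above; the -safemode get checks are optional but confirm the state before and after:

cd /usr/local/hadoop
# Check the current state; prints "Safe mode is ON" or "Safe mode is OFF"
bin/hdfs dfsadmin -safemode get
# Force the NameNode out of safe mode
bin/hdfs dfsadmin -safemode leave
# Re-check: should now print "Safe mode is OFF"
bin/hdfs dfsadmin -safemode get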
2. The following warning appeared when submitting a Spark job:
NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Solution:
(1) sudo vim /etc/profile
(2) Append at the end: export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
(3) Switch back to the regular user hadoop and run source /etc/profile to make the new configuration take effect.
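A quick way to confirm the fix took effect, assuming the same /usr/local/hadoop path; hadoop checknative reports whether the native library actually loads:

# Run as the hadoop user, after source /etc/profile
echo $LD_LIBRARY_PATH        # should now contain /usr/local/hadoop/lib/native
/usr/local/hadoop/bin/hadoop checknative
# Once the native library loads, the "hadoop:" line in the output reads "true"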
3. Even though the configured resources were sufficient and the workers showed up normally on the Spark web UI, this warning kept repeating: WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
The Spark logs reported the same resource shortage:
Master: App app-20231114194550-0009 requires more resource than any of Workers could have.
Solution: although all the slave nodes started normally and each ran a worker, the master host itself had no worker. In other words, the started workers never truly connected to the master, so every submission kept reporting insufficient resources.
Start a worker on the master host separately, as shown in the sketch after this list:
(1) On the master, enter the Spark directory: cd /usr/local/spark
(2) Start the worker by itself: sbin/start-slave.sh spark://192.168.0.150:7077 (use the master's public IP)
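A sketch of the same fix plus a verification step. The IP and the memory/core values are illustrative, and examples/src/main/python/pi.py is the stock Pi example shipped with Spark; note that on Spark 3.1+ the script is sbin/start-worker.sh rather than start-slave.sh:

cd /usr/local/spark
# Start a worker on the master host, pointed at the master itself
sbin/start-slave.sh spark://192.168.0.150:7077
# Verify it registered: the new worker should appear under "Workers"
# on the master web UI at http://192.168.0.150:8080
# Optionally cap what a job requests so it fits on the smallest worker:
bin/spark-submit --master spark://192.168.0.150:7077 \
  --executor-memory 1g --total-executor-cores 2 \
  examples/src/main/python/pi.py 10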
(That last baffling warning cost more than two days: suspecting the files from the previous step, suspecting the code... before the real cause finally turned up. Lab 1's configuration had worked fine, so what exactly caused this bug to appear?)

@bqinmuzhi (Author)

Group 14
