How to run Kettle tasks on Linux


2023-11-06 08:41 | Source: compiled from the web

Running a Kettle transformation on Linux

(You need to upload the Kettle package to the server first.)

The task extracts data from Cassandra into HDFS. For how the transformation itself is built, see the previous post: https://blog.csdn.net/oppo62258801/article/details/89501428.

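Since the transformation runs on Linux, we wanted to pass time parameters into it and then drive incremental extraction with a Linux script and a scheduled job, which is more convenient. I read a lot of blog posts and experimented along the way; the approach we ended up with is described here for reference.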

In the CQL, put ${variable_name} wherever you want to pass a parameter in (for example ${tablename}), like this:

select * from testtable where id >= ${id11} and id < ${id22} ALLOW FILTERING;

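(If the value is a string, for example one containing spaces, wrap the placeholder in single quotes: '${variable_name}'.)

The values you pass in later are substituted for id11 and id22.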
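To make the substitution concrete, the shell sketch below previews what the CQL looks like once the placeholders are filled in. It is not Kettle's own substitution logic, only a stand-in built on sed for checking a query by eye; the function name render_cql and the values 100 and 200 are made up for illustration.

```shell
# Preview a CQL template with ${name} placeholders filled in.
# Usage: render_cql 'template' name1=value1 name2=value2 ...
# Assumption: values do not contain '|', '&' or backslashes,
# which sed would treat specially in the replacement.
render_cql() {
  local sql pair name value
  sql=$1
  shift
  for pair in "$@"; do
    name=${pair%%=*}    # text before the first '='
    value=${pair#*=}    # text after the first '='
    # \$ makes sed match a literal dollar sign in front of {name}
    sql=$(printf '%s\n' "$sql" | sed "s|\\\${$name}|$value|g")
  done
  printf '%s\n' "$sql"
}

# Example call (hypothetical values):
# render_cql 'select * from testtable where id >= ${id11} and id < ${id22} ALLOW FILTERING;' id11=100 id22=200
```

With the hypothetical values above, the placeholders ${id11} and ${id22} are replaced by 100 and 200 and the rest of the statement is left untouched.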

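Double-click the canvas and a dialog pops up. Select Named Parameters and enter the parameter name (for the CQL above, enter id11 here, and likewise id22). Then click OK.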

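Click the triangle icon to run. Another dialog appears, listing the named parameters you just defined; click the start button. If nothing is wrong, the run completes successfully.

Upload the transformation (the .ktr file) to the server. In Kettle, the pan.sh script runs transformations (.ktr) and the kitchen.sh script runs jobs (.kjb). After that you need to know two paths: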

1. The path to pan.sh or kitchen.sh, which is under kettle-<version>/data-integration/

2. The path to your transformation (.ktr)
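Because the launcher depends on the file type, a small helper can pick it for you. This is only a sketch: the function name runner_for is hypothetical, and it encodes nothing beyond the .ktr/.kjb rule stated above.

```shell
# Print the Kettle launcher script that matches a file's extension.
# .ktr (transformation) -> pan.sh, .kjb (job) -> kitchen.sh
runner_for() {
  case "$1" in
    *.ktr) echo "pan.sh" ;;
    *.kjb) echo "kitchen.sh" ;;
    *)
      echo "unknown Kettle file type: $1" >&2
      return 1
      ;;
  esac
}

# Example: runner_for /root/kettle-8.2.0.0-342/kettle/transfromtest0424.ktr
```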

Launch the transformation with pan.sh and append the parameters:

sh /your/path/kettle-<version>/data-integration/pan.sh -norep -file=/your/path/your_transformation.ktr -param:name=value -param:name2=value2

For example:

sh /root/kettle-8.2.0.0-342/data-integration/pan.sh  -norep -file=/root/kettle-8.2.0.0-342/kettle/transfromtest0424.ktr -param:starttime='2019-03-11 00:00:00.100' -param:endtime='2019-03-12 00:00:10.100'  
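For the incremental extraction mentioned at the start, a wrapper script along the following lines could compute a one-day window and hand it to pan.sh. It is a minimal sketch under several assumptions: GNU date is available (for the -d option), the transformation defines named parameters called starttime and endtime as in the example command, the window is one whole calendar day with a .000 millisecond suffix, and the function names day_start, day_end and run_increment are hypothetical. The paths are the ones from the example and need adjusting to your install.

```shell
#!/bin/sh
# Paths taken from the example command; change them for your server.
PAN=/root/kettle-8.2.0.0-342/data-integration/pan.sh
KTR=/root/kettle-8.2.0.0-342/kettle/transfromtest0424.ktr

# Print the start of the given day (YYYY-MM-DD) as a timestamp.
day_start() {
  local d
  d=$(date -d "$1" +%F) || return 1   # GNU date: validate and normalise
  printf '%s 00:00:00.000\n' "$d"
}

# Print the start of the following day, used as the exclusive upper bound.
day_end() {
  local d
  d=$(date -d "$1 +1 day" +%F) || return 1
  printf '%s 00:00:00.000\n' "$d"
}

# Run the transformation for one day and report a non-zero exit status.
run_increment() {
  local start end rc
  start=$(day_start "$1") || return 1
  end=$(day_end "$1") || return 1
  sh "$PAN" -norep -file="$KTR" \
    "-param:starttime=$start" "-param:endtime=$end"
  rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "pan.sh exited with status $rc for window $start .. $end" >&2
  fi
  return "$rc"
}

# To process yesterday's data, end the script with:
# run_increment "$(date -d yesterday +%F)"
```

Saved as, say, /root/run_increment.sh (a hypothetical path) with the last line uncommented, a cron entry such as `0 1 * * * sh /root/run_increment.sh >> /root/run_increment.log 2>&1` would extract the previous day's data at 01:00 every night. Quoting each -param argument as a whole keeps the space inside the timestamp from splitting the value.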

Run result: [screenshot not preserved in this copy]

# Some errors we ran into, and suggestions from our troubleshooting experience:

2019/04/25 10:15:15 - Cassandra Input.0 - Closing connection ...
org.pentaho.di.core.exception.KettleException: 
All host(s) tried for query failed (tried: /10.66.1.52:9042 (com.datastax.driver.core.exceptions.TransportException: [/10.66.1.52:9042] Cannot connect))
All host(s) tried for query failed (tried: /10.66.1.52:9042 (com.datastax.driver.core.exceptions.TransportException: [/10.66.1.52:9042] Cannot connect))
    at org.pentaho.di.trans.steps.cassandrainput.CassandraInput.processRow(CassandraInput.java:159)
    at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.66.1.52:9042 (com.datastax.driver.core.exceptions.TransportException: [/10.66.1.52:9042] Cannot connect))
    at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:232)
    at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:79)
    at com.datastax.driver.core.Cluster$Manager.negotiateProtocolVersionAndConnect(Cluster.java:1619)
    at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1537)
    at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:399)
    at org.pentaho.cassandra.driver.datastax.DriverConnection.getKeyspace(DriverConnection.java:152)
    at org.pentaho.di.trans.steps.cassandrainput.CassandraInput.processRow(CassandraInput.java:156)
    ... 2 more
2019/04/25 10:15:15 - Cassandra Input.0 - ERROR (version 8.2.0.0-342, build 8.2.0.0-342 from 2018-11-14 10.30.55 by buildguy) : Unexpected error
2019/04/25 10:15:15 - Cassandra Input.0 - ERROR (version 8.2.0.0-342, build 8.2.0.0-342 from 2018-11-14 10.30.55 by buildguy) : org.pentaho.di.core.exception.KettleException: 
2019/04/25 10:15:15 - Cassandra Input.0 - All host(s) tried for query failed (tried: /10.66.1.52:9042 (com.datastax.driver.core.exceptions.TransportException: [/10.66.1.52:9042] Cannot connect))
2019/04/25 10:15:15 - Cassandra Input.0 - All host(s) tried for query failed (tried: /10.66.1.52:9042 (com.datastax.driver.core.exceptions.TransportException: [/10.66.1.52:9042] Cannot connect))
2019/04/25 10:15:15 - Cassandra Input.0 - 
2019/04/25 10:15:15 - Cassandra Input.0 -     at org.pentaho.di.trans.steps.cassandrainput.CassandraInput.processRow(CassandraInput.java:159)
2019/04/25 10:15:15 - Cassandra Input.0 -     at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
2019/04/25 10:15:15 - Cassandra Input.0 -     at java.lang.Thread.run(Thread.java:748)
2019/04/25 10:15:15 - Cassandra Input.0 - Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.66.1.52:9042 (com.datastax.driver.core.exceptions.TransportException: [/10.66.1.52:9042] Cannot connect))
2019/04/25 10:15:15 - Cassandra Input.0 -     at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:232)
2019/04/25 10:15:15 - Cassandra Input.0 -     at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:79)
2019/04/25 10:15:15 - Cassandra Input.0 -     at com.datastax.driver.core.Cluster$Manager.negotiateProtocolVersionAndConnect(Cluster.java:1619)
2019/04/25 10:15:15 - Cassandra Input.0 -     at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1537)
2019/04/25 10:15:15 - Cassandra Input.0 -     at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:399)
2019/04/25 10:15:15 - Cassandra Input.0 -     at org.pentaho.cassandra.driver.datastax.DriverConnection.getKeyspace(DriverConnection.java:152)
2019/04/25 10:15:15 - Cassandra Input.0 -     at org.pentaho.di.trans.steps.cassandrainput.CassandraInput.processRow(CassandraInput.java:156)
2019/04/25 10:15:15 - Cassandra Input.0 -     ... 2 more
2019/04/25 10:15:15 - Cassandra Input.0 - 完成处理 (I=0, O=0, R=0, W=0, U=0, E=1)
2019/04/25 10:15:15 - test042401 - ERROR (version 8.2.0.0-342, build 8.2.0.0-342 from 2018-11-14 10.30.55 by buildguy) : 错误被检测到!
2019/04/25 10:15:15 - test042401 - 转换被检测 
2019/04/25 10:15:15 - test042401 - 转换正在杀死其他步骤!
2019/04/25 10:15:15 - Carte - Installing timer to purge stale objects after 1440 minutes.
2019/04/25 10:15:15 - test042401 - ERROR (version 8.2.0.0-342, build 8.2.0.0-342 from 2018-11-14 10.30.55 by buildguy) : 错误被检测到!
2019/04/25 10:15:15 - test042402 - 完成作业项[转换] (结果=[false])
2019/04/25 10:15:15 - test042402 - 任务执行完毕
2019/04/25 10:15:15 - Kitchen - Finished!
2019/04/25 10:15:15 - Kitchen - ERROR (version 8.2.0.0-342, build 8.2.0.0-342 from 2018-11-14 10.30.55 by buildguy) : Finished with errors
2019/04/25 10:15:15 - Kitchen - Start=2019/04/25 10:15:01.138, Stop=2019/04/25 10:15:15.520
2019/04/25 10:15:15 - Kitchen - Processing ended after 14 seconds.

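A: In this case, check the network. Ping the Cassandra host from the Linux server to see whether the link is up. Ping only tests basic reachability, so it is also worth confirming that the port named in the log (9042 here) accepts connections.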
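The check can be scripted roughly as follows. This is a sketch, not part of Kettle: the function names failed_endpoints and check_port are hypothetical, the grep pattern assumes the "tried: /host:port" wording seen in the log above with an IPv4 address, and the port probe assumes bash (for its /dev/tcp redirection) and the coreutils timeout command are installed.

```shell
# Read a Kettle log on stdin and print each distinct host:port that the
# Cassandra driver reported as "tried", one per line.
failed_endpoints() {
  grep -o 'tried: /[0-9.]*:[0-9]*' | sed 's|tried: /||' | sort -u
}

# Return 0 if a TCP connection to host:port succeeds within 5 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
check_port() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example (pan.log is a hypothetical file holding the saved output):
# failed_endpoints < pan.log | while IFS=: read -r host port; do
#   if check_port "$host" "$port"; then
#     echo "$host:$port reachable"
#   else
#     echo "$host:$port NOT reachable"
#   fi
# done
```

If the port turns out to be unreachable while ping succeeds, the usual suspects are a firewall between the two machines or Cassandra not listening on that address.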


