Installing Thrift 0.12.0 and Parquet


2023-11-29 18:41

Contents

Installing Thrift 0.12.0
  Thrift version pitfalls
  Preparing the build environment
    Upgrading GCC
    Install Boost
    Install libevent
  Building and installing Thrift 0.12.0
  Troubleshooting
    error: no member named 'stdcxx' in namespace 'apache::thrift'
    fatal error: 'openssl/opensslv.h' file not found
    composer: No such file or directory
    go.mod file not found in current directory or any parent
    error: could not create '/usr/lib/python3.8': Operation not permitted
    '/usr/lib/php/TMultiplexedProcessor.php': Operation not permitted
Building parquet-mr
  Importing the project into IDEA
  Changing the scope of POM dependencies
Dev and debug parquet-tools
  Enable Local Profile
  Fast debug parquet-tools

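The parquet-mr version at my new company is quite a bit newer than the one in the old CDH release. To make reading the Parquet code easier, I needed to build parquet-mr locally.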

Installing Thrift 0.12.0

Thrift version pitfalls

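With the old CDH code, Thrift 0.9 worked for everything and a single brew install was enough. The new company's code uses a newer Thrift, so I wanted the matching version installed locally.

The project requires Thrift 0.12. Version 0.9 is too old, and in 0.14.0 some classes have changed, so the code does not compile against that version either.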
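Since the build only works with one specific Thrift release, it can help to check which compiler is on the PATH before starting. The following is a minimal sketch, assuming the `thrift --version` output has the form "Thrift version X.Y.Z"; the function names are hypothetical helpers, not part of Thrift.

```shell
# Hypothetical helper: extract "X.Y.Z" from a `thrift --version` line.
# Assumes the line looks like "Thrift version 0.12.0".
parse_thrift_version() {
  printf '%s\n' "$1" | awk '{print $NF}'
}

# Hypothetical helper: succeed only when the reported version
# matches the required one exactly.
#   $1 = required version, $2 = output of `thrift --version`
thrift_version_ok() {
  found="$(parse_thrift_version "$2")"
  [ "$found" = "$1" ]
}
```

A possible use before building parquet-mr: `thrift_version_ok 0.12.0 "$(thrift --version)" || echo "unexpected thrift version"`.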

    $ brew search thrift
    ==> Formulae
    thrift ✔    [email protected]

So what follows is the long process of installing Thrift 0.12.0.

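Preparing the build environment

Upgrading GCC

When building from source, the gcc that ships with the machine reported version 4.2.1, which is too old, so I upgraded gcc along the way.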

    brew install gcc@8
    alias gcc='gcc-8'
    alias cc='gcc-8'
    alias g++='g++-8'
    alias c++='c++-8'

Install Boost

Boost is a C++ library, and it is best installed by hand. The project's website takes some getting used to, so here is the link directly:

https://www.boost.org/doc/libs/1_75_0/more/getting_started/unix-variants.html#easy-build-and-install

After downloading, run:

    ./bootstrap.sh
    sudo ./b2 threading=multi address-model=64 variant=release stage install

Install libevent

libevent can be installed with brew:

    brew install libevent

Building and installing Thrift 0.12.0

With the environment above in place, try installing Thrift again:

    wget https://mirrors.tuna.tsinghua.edu.cn/apache/thrift/0.12.0/thrift-0.12.0.tar.gz
    tar -xzf thrift-0.12.0.tar.gz
    cd thrift-0.12.0
    ./configure --without-ruby PY_PREFIX=/Users/wakun/opt/anaconda3/lib/python3.8
    make

Troubleshooting

error: no member named 'stdcxx' in namespace 'apache::thrift'

    src/thrift/async/TAsyncProtocolProcessor.cpp:29:55: error: no member named 'stdcxx' in namespace 'apache::thrift'
    void TAsyncProtocolProcessor::process(apache::thrift::stdcxx::function _return,
    ~~~~~~~~~~~~~~~~^
    src/thrift/async/TAsyncProtocolProcessor.cpp:29:31: error: variable has incomplete type 'void'
    void TAsyncProtocolProcessor::process(apache::thrift::stdcxx::function _return,
    ^
    src/thrift/async/TAsyncProtocolProcessor.cpp:30:39: error: use of undeclared identifier 'stdcxx'
    stdcxx::shared_ptr ibuf,
    ^
    src/thrift/async/TAsyncProtocolProcessor.cpp:31:39: error: use of undeclared identifier 'stdcxx'
    stdcxx::shared_ptr obuf) {
    ^
    src/thrift/async/TAsyncProtocolProcessor.cpp:31:76: error: expected ';' after top level declarator
    stdcxx::shared_ptr obuf) {
    ^
    ;
    5 errors generated.

This error was a strange one. The message says that stdcxx cannot be found in namespace 'apache::thrift'. At first I assumed stdcxx was a C++ standard library and that some library had not been installed. That turned out to be wrong: this stdcxx is not the open-source stdcxx standard library but a header that Thrift ships itself, at ./lib/cpp/src/thrift/stdcxx.h. I then downloaded the latest thrift-0.14.1 source and found that the header no longer exists there. So the 0.12.0 build must have been picking up the 0.14.1 library that brew had installed. After a blunt brew uninstall thrift, the build went back to normal.
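The stale-header problem above can be spotted before running make by looking for an existing Thrift header directory on the include paths the compiler will search. This is a minimal sketch; `find_thrift_headers` is a hypothetical helper, and it assumes an installed Thrift places its C++ headers in a `thrift/` directory directly under the include root.

```shell
# Hypothetical helper: print every <root>/thrift directory that exists
# under the include roots passed as arguments.
find_thrift_headers() {
  for root in "$@"; do
    # Assumption: an installed Thrift keeps its C++ headers in <root>/thrift/
    if [ -d "$root/thrift" ]; then
      printf '%s\n' "$root/thrift"
    fi
  done
}
```

For example, `find_thrift_headers /usr/local/include` printing a path before the 0.12.0 build suggests that another Thrift version is already installed and may shadow the headers in the source tree.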

fatal error: 'openssl/opensslv.h' file not found

    libtool: compile: g++ -std=c++11 -DHAVE_CONFIG_H -I. -I../.. -I../../lib/cpp/src/thrift -I../../lib/c_glib/src/thrift -I/usr/local/include -I./src -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/usr/local/Cellar/thrift/0.14.1/include -Wall -Wextra -pedantic -g -O2 -MT src/thrift/transport/TSSLSocket.lo -MD -MP -MF src/thrift/transport/.deps/TSSLSocket.Tpo -c src/thrift/transport/TSSLSocket.cpp -fno-common -DPIC -o src/thrift/transport/.libs/TSSLSocket.o
    src/thrift/transport/TSSLSocket.cpp:43:10: fatal error: 'openssl/opensslv.h' file not found
    #include <openssl/opensslv.h>
             ^~~~~~~~~~~~~~~~~~~~
    1 error generated.
    make[4]: *** [src/thrift/transport/TSSLSocket.lo] Error 1

OpenSSL installed on the Mac through brew is not on the default search paths. The environment variables have to be set, CPPFLAGS in particular, before the compiler can find the files in its include directory:
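The two variables can be derived from the OpenSSL install prefix instead of being typed by hand. A minimal sketch, assuming the prefix contains the usual `lib/` and `include/` subdirectories; `openssl_build_flags` is a hypothetical helper name.

```shell
# Hypothetical helper: print export lines for the given OpenSSL prefix.
#   $1 = install prefix, e.g. the path printed by `brew --prefix <formula>`
openssl_build_flags() {
  prefix="$1"
  printf 'export LDFLAGS="-L%s/lib"\n' "$prefix"
  printf 'export CPPFLAGS="-I%s/include"\n' "$prefix"
}
```

One way to apply it before running ./configure would be `eval "$(openssl_build_flags "$(brew --prefix openssl)")"`, where the formula name `openssl` is an assumption and should match whatever brew actually installed.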

    For compilers to find [email protected] you may need to set:
    export LDFLAGS="-L/usr/local/opt/[email protected]/lib"
    export CPPFLAGS="-I/usr/local/opt/[email protected]/include"

composer: No such file or directory

    Making all in php
    Making all in test
    composer install --working-dir=../../..
    make[4]: composer: No such file or directory
    make[4]: *** [deps] Error 1
    make[3]: *** [all-recursive] Error 1
    make[2]: *** [all-recursive] Error 1
    make[1]: *** [all-recursive] Error 1
    make: *** [all] Error 2

The Thrift build uses PHP's composer tool, so it has to be installed first with brew install composer.

Composer: Composer is a tool for dependency management in PHP. It allows you to declare the libraries your project depends on and it will manage (install/update) them for you.

go.mod file not found in current directory or any parent

    make[4]: Nothing to be done for `all-am'.
    Making all in go
    Making all in .
    GOPATH=`pwd` /usr/local/bin/go build ./thrift
    go: go.mod file not found in current directory or any parent directory; see 'go help modules'
    make[4]: *** [all-local] Error 1
    make[3]: *** [all-recursive] Error 1
    make[2]: *** [all-recursive] Error 1
    make[1]: *** [all-recursive] Error 1
    make: *** [all] Error 2

I seem to have fixed this by running go mod init once; I am not sure of the exact cause.

    GOPATH=`pwd` /usr/local/bin/go build ./thrift
    go: inconsistent vendoring in /Users/wakun/Applications/thrift-0.12.0:
    github.com/golang/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
    To ignore the vendor directory, use -mod=readonly or -mod=mod.
    To sync the vendor directory, run:
    go mod vendor

error: could not create '/usr/lib/python3.8': Operation not permitted

    /Library/Developer/CommandLineTools/usr/bin/make install-exec-hook
    /Users/wakun/opt/anaconda3/bin/python setup.py install --root=/ --prefix=/usr
    running install
    running build
    running build_py
    running build_ext
    running install_lib
    creating /usr/lib/python3.8
    error: could not create '/usr/lib/python3.8': Operation not permitted
    make[4]: *** [install-exec-hook] Error 1
    make[3]: *** [install-exec-am] Error 2
    make[2]: *** [install-am] Error 2
    make[1]: *** [install-recursive] Error 1
    make: *** [install-recursive] Error 1

By default the built Python package is installed under /usr, which is not writable here, so the install location has to be pointed at a directory of our choosing:

./configure --without-ruby PY_PREFIX=/Users/wakun/opt/anaconda3/lib/python3.8

https://cwiki.apache.org/confluence/display/THRIFT/ThriftInstallation

Please be aware that the Python library will ignore the --prefix option and just install wherever Python’s distutils puts it (usually along the lines of /usr/lib/pythonX.Y/site-packages/). If you need to control where the Python modules are installed, set the PY_PREFIX variable. (DESTDIR is respected for Python and C++.)

'/usr/lib/php/TMultiplexedProcessor.php': Operation not permitted

    /usr/local/bin/ginstall -c -m 644 lib/TMultiplexedProcessor.php '/usr/lib/php/'
    ginstall: cannot create regular file '/usr/lib/php/TMultiplexedProcessor.php': Operation not permitted
    make[4]: *** [install-phpDATA] Error 1
    make[3]: *** [install-am] Error 2
    make[2]: *** [install-recursive] Error 1
    make[1]: *** [install-recursive] Error 1
    make: *** [install-recursive] Error 1

This is the same problem as the Python one above; setting the PHP_PREFIX variable solves it.
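Putting the two prefix fixes together, the configure invocation can be assembled from a pair of writable directories. This is only a sketch: `thrift_configure_cmd` is a hypothetical helper that prints the command rather than running it, and the PHP directory in the usage note is an invented example path.

```shell
# Hypothetical helper: print a configure command that redirects both the
# Python and the PHP library installs away from /usr.
#   $1 = directory for PY_PREFIX, $2 = directory for PHP_PREFIX
thrift_configure_cmd() {
  py_prefix="$1"
  php_prefix="$2"
  printf './configure --without-ruby PY_PREFIX=%s PHP_PREFIX=%s\n' \
    "$py_prefix" "$php_prefix"
}
```

For instance, `thrift_configure_cmd /Users/wakun/opt/anaconda3/lib/python3.8 /Users/wakun/opt/php-lib` prints a command line that can be reviewed and then pasted into the Thrift source directory.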

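Building parquet-mr

Build command: mvn clean install -DskipTests=true

Importing the project into IDEA

parquet-mr is a rather unusual project: some of its Java code is generated from other code, so right after importing the project into IDEA those classes cannot be resolved. The directories below need to be added to the CLASSPATH: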

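    $ find . -name generated-src
    ./parquet-common/target/generated-src
    ./parquet-encoding/target/generated-src
    ./parquet-column/target/generated-src

Changing the scope of POM dependencies

In addition, with the current 1.11.1 code, the parquet-hive-storage-handler module fails to compile. The code under its main directory references code from other modules, yet the scope of those dependencies is test, which is also somewhat unconventional.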

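    <dependency>
      <groupId>org.apache.parquet</groupId>
      <artifactId>parquet-hive-binding-factory</artifactId>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.parquet</groupId>
      <artifactId>parquet-hive-binding-interface</artifactId>
      <version>${project.version}</version>
    </dependency>

Dev and debug parquet-tools

Enable Local Profile

Debugging parquet-tools locally fails by default with the following error: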

java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

The project defines a hadoop.scope property that defaults to provided. Enabling the project's local profile sets hadoop.scope = compile, so the Hadoop classes can be loaded.
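Outside the IDE, the same profile could be switched on from the command line with Maven's `-P` option. The sketch below only prints the command; it assumes the profile id is literally `local`, as the text above suggests, and `parquet_tools_build_cmd` is a hypothetical helper name.

```shell
# Hypothetical helper: print a Maven command that builds with the given
# profile enabled (defaults to "local", which is an assumed profile id).
parquet_tools_build_cmd() {
  profile="${1:-local}"
  printf 'mvn clean package -DskipTests=true -P%s\n' "$profile"
}
```

Running the printed command inside the parquet-tools module directory should produce a build in which the Hadoop dependencies are on the compile classpath, provided the profile exists under that id.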

Fast debug parquet-tools

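To make debugging parquet-tools easier, modify parquet-tools/src/main/java/org/apache/parquet/tools/Main.java as follows:

    public static void main(String[] args2) {
      // Ignore the real command-line arguments (renamed to args2) and
      // hard-code the command and input file for quick local debugging.
      String[] args = {"meta", "/Users/wakun/Downloads/part-00000-93fb1ee1-5d40-4881-ba50-be3d0db8431a-c000.snappy.parquet"};
      Main.out = System.out;
      // ... the rest of the original method body follows unchanged
    }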
